TITLE

PLANT GENE FOR p-HYDROXYPHENYLPYRUVATE DIOXYGENASE

FIELD OF THE INVENTION 
This invention relates to the isolation and modification of nucleic acid
encoding p-hydroxyphenylpyruvate dioxygenase enzyme from plants. These
nucleic acid sequences were used to establish methods of identification of new
herbicidal compounds that inhibit the activity of this enzyme, and to prepare new
crop plants that are tolerant to the herbicidal action of inhibitors of this enzyme.
Chimeric genes comprising nucleic acid fragments containing all or part of the
nucleic acid sequences encoding p-hydroxyphenylpyruvate dioxygenase may be
used to produce active plant p-hydroxyphenylpyruvate dioxygenase enzyme in
microorganisms, and to cause the production of modified forms of the enzyme in
plants that may render such plants tolerant to inhibitors of the enzyme.

BACKGROUND OF THE INVENTION 

Bleaching herbicides affect plant chloroplasts by decreasing their
chlorophyll and carotenoid content. Several bleaching herbicides are known to
inhibit the enzyme phytoene desaturase, resulting in the accumulation of phytoene
in treated plants. However, compounds of the benzoyl cyclohexane-1,3-dione
type cause the accumulation of phytoene in plants but are not inhibitors of
phytoene desaturase in vitro (Sandmann, G., et al. (1990) Pestic. Sci. 30:353-355).
Subsequent work revealed that these compounds are effective inhibitors of
p-hydroxyphenylpyruvate dioxygenase (p-hydroxyphenylpyruvate:oxygen
oxidoreductase, EC 1.13.11.27), a key enzyme in the biosynthesis of
plastoquinones and tocopherols (Schulz, A., et al. (1993) FEBS Lett.
318:162-166). Based on the observation that phytoene desaturase requires a
quinone as an electron acceptor, these authors postulated that by inhibiting
p-hydroxyphenylpyruvate dioxygenase, these herbicides act indirectly on
phytoene desaturase by blocking the biosynthesis of quinones.

The proposal that p-hydroxyphenylpyruvate dioxygenase is essential for
carotenoid biosynthesis has received support from genetic studies in the plant
model system Arabidopsis thaliana. Mutations in the pds1 and pds2 genetic loci
result in mutant plants that accumulate phytoene. However, genetic mapping of
these mutant genes indicates that they do not correspond to the gene encoding the
enzyme phytoene desaturase. The pds1 mutation can be rescued by homogentisic
acid, the product of p-hydroxyphenylpyruvate dioxygenase. Therefore, this
mutation corresponds to a defect in the activity of p-hydroxyphenylpyruvate
dioxygenase (Norris, S. R., et al. (1995) Plant Cell 7:2139-2149).

In light of these disclosures, p-hydroxyphenylpyruvate dioxygenase is a
promising target for new herbicidal compounds. Research aimed at
discovering new herbicides based on this mode of action would be greatly
facilitated by the isolation of the plant gene encoding this enzyme and by the
functional expression of this gene in transgenic organisms. For example, active
enzyme produced in recombinant microorganisms could be used to establish
screening methods for the identification of novel active compounds and to obtain
structural and mechanistic information useful to guide further chemical synthesis.
Furthermore, isolation of this gene would facilitate research aimed at generating
mutant, herbicide-tolerant versions of the enzyme that may confer herbicide
resistance to transgenic plants.

A partial sequence of an Arabidopsis thaliana cDNA with homology to
corresponding mammalian sequences encoding p-hydroxyphenylpyruvate
dioxygenase has been identified (GenBank Accession No. T20952), but this
truncated sequence is insufficient to identify an active plant
p-hydroxyphenylpyruvate dioxygenase. WO 96/38567 A2 addresses the utility that
would be attached to a DNA sequence of a p-hydroxyphenylpyruvate dioxygenase
gene, but there is no biochemical evidence of function associated with the
sequences disclosed.

SUMMARY OF THE INVENTION

This invention pertains to the isolation and characterization of nucleic acid
fragments encoding plant p-hydroxyphenylpyruvate dioxygenase enzymes. More
specifically, this invention pertains to isolated nucleic acid fragments encoding the
p-hydroxyphenylpyruvate dioxygenase enzymes from Arabidopsis thaliana and
Zea mays.

This invention also pertains to the production of active plant
p-hydroxyphenylpyruvate dioxygenase enzyme in E. coli. In one embodiment, a
chimeric gene comprising a nucleic acid fragment encoding a polypeptide that
possesses p-hydroxyphenylpyruvate dioxygenase activity, operably linked to
regulatory sequences that direct gene expression in E. coli, is claimed. In another
embodiment, a plasmid vector comprising said chimeric gene is disclosed. In yet
another embodiment, a transformed E. coli comprising a chimeric gene consisting
of a nucleic acid fragment encoding a polypeptide that possesses
p-hydroxyphenylpyruvate dioxygenase activity is disclosed.

This invention also pertains to a method of identifying substances that
inhibit the rate of the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme.
In one embodiment, the invention pertains to an assay for the detection of
inhibitors of p-hydroxyphenylpyruvate dioxygenase wherein a polypeptide
derived from a transformed E. coli that displays p-hydroxyphenylpyruvate
dioxygenase activity is incubated in the presence of a test substance. Following
incubation, p-hydroxyphenylpyruvate dioxygenase enzymatic activity is measured,
wherein a reduction of enzymatic activity is indicative of the inhibitory capacity
of the test substance. Enzymatic activity can be measured by any appropriate
means, including but not limited to oxygen utilization, carbon dioxide release,
homogentisate production, and loss of p-hydroxyphenylpyruvate. Results are
quantified by radiometric, colorimetric or chromatographic means.
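
By way of illustration only, the reduction of enzymatic activity described above
can be expressed as a percent inhibition relative to a control incubated without
test substance. The following is a minimal sketch; the function names and the
50% cutoff are assumptions chosen for the example, and no particular threshold
is specified in this disclosure.

```python
def percent_inhibition(control_activity: float, treated_activity: float) -> float:
    """Percent reduction in enzymatic activity caused by a test substance.

    Both activities must be in the same units (for example, nmol of
    homogentisate formed per minute); the units themselves cancel out.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (control_activity - treated_activity) / control_activity


def is_inhibitor(control_activity: float, treated_activity: float,
                 threshold_percent: float = 50.0) -> bool:
    """Flag a test substance whose inhibition meets a hypothetical cutoff."""
    return percent_inhibition(control_activity, treated_activity) >= threshold_percent
```

For example, a control activity of 10.0 and a treated activity of 2.5 (arbitrary
but identical units) correspond to 75% inhibition.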

In another embodiment, this invention pertains to plants that are
substantially tolerant to the application of at least one compound that inhibits the
rate of the reaction of p-hydroxyphenylpyruvate dioxygenase. Plants may be
rendered tolerant by overexpression of the wild-type p-hydroxyphenylpyruvate
dioxygenase, by expression of a naturally occurring resistant variant of this
enzyme, or by expression of an altered form of p-hydroxyphenylpyruvate
dioxygenase that is resistant to the action of compounds that are inhibitory to the
wild-type enzyme.

A further embodiment of the invention is an isolated nucleic acid fragment 
comprising a member selected from the group consisting of 

(a) an isolated nucleic acid fragment as set forth in SEQ ID NO:16;
(b) an isolated nucleic acid fragment that is essentially similar to an
isolated nucleic acid fragment as set forth in SEQ ID NO:16; and
(c) an isolated nucleic acid fragment that is complementary to (a) or
(b).

BRIEF DESCRIPTION OF THE 
DRAWINGS AND SEQUENCE DESCRIPTIONS 
The invention can be more fully understood from the following detailed
description and the accompanying drawings and the sequence descriptions which
form a part of this application.

Figure 1 presents a partial nucleic acid sequence of an expressed sequence 
tag (EST) bearing GenBank Accession No. T92052 obtained from an Arabidopsis 
thaliana cDNA library. This sequence was contained in clone 91B13T7 of the 
library. 

Figure 2 presents the nucleic acid sequence of the cloned cDNA encoding a
full-length form of Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
enzyme, as it was initially determined (SEQ ID NO:2). Translation start and stop
codons are underlined. Selected restriction sites are indicated.



WO 97/49816 PCT/US97/11295

Figure 3 presents the amino acid sequence comparison between full-length
p-hydroxyphenylpyruvate dioxygenases from Arabidopsis thaliana (SEQ ID
NO:15) and Zea mays (SEQ ID NO:11) and the p-hydroxyphenylpyruvate
dioxygenase enzymes derived from human (SEQ ID NO:6, GenBank Acc.
No. U29895), pig (SEQ ID NO:7, GenBank Acc. No. D13390), mouse (SEQ ID
NO:8, GenBank Acc. No. D29987) and rat (SEQ ID NO:9, GenBank Acc.
No. M18405). Asterisks indicate amino acid residues that are conserved across all
six species. This figure was created using the Pileup program of GCG (Program
Manual for the Wisconsin Package, Version 9.0-OpenVMS, December 1996,
Genetics Computer Group, 575 Science Drive, Madison, WI, USA 53711).

Figure 4 is a diagram describing the construction of the intermediate 
plasmid vector pT7BlueR + PDOl . 

Figure 5 is a diagram describing the construction of E. coli expression
vector pE24CPl.

Applicants have provided a sequence listing in conformity with "Rules for
the Standard Representation of Nucleotide and Amino Acid Sequences in Patent
Applications" (Annexes I and II to the Decision of the President of the EPO,
published in Supplement No. 2 to OJ EPO, 12/1992) and with 37 C.F.R.
1.821-1.825 and Appendices A and B ("Requirements for Application Disclosures
Containing Nucleotides and/or Amino Acid Sequences").

SEQ ID NO:1 presents a partial nucleic acid sequence of an expressed
sequence tag (EST) bearing GenBank Accession No. T92052 obtained from an
Arabidopsis thaliana cDNA library. This sequence was contained in clone
91B13T7 of the library.

SEQ ID NO:2 presents the initial determination of the nucleic acid sequence
and the deduced amino acid sequence of a cDNA encoding a full-length form of
Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase enzyme, as
contained in plasmid pGBPPD2.

SEQ ID NO:3 presents the initially deduced amino acid sequence encoded
by a cDNA for Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
enzyme.

SEQ ID NOS:4 and 5 present the nucleotide sequences of a pair of
complementary oligonucleotides (CAM 32 and CAM 33, respectively) used to
facilitate subcloning and expression of the gene encoding
p-hydroxyphenylpyruvate dioxygenase without the chloroplast transit sequence.

SEQ ID NO:6 presents the amino acid sequence of p-hydroxyphenylpyruvate
dioxygenase enzyme derived from human (GenBank Acc. No. U29895).

SEQ ID NO:7 presents the amino acid sequence of p-hydroxyphenylpyruvate
dioxygenase enzyme derived from pig (GenBank Acc. No. D13390).

SEQ ID NO:8 presents the amino acid sequence of p-hydroxyphenylpyruvate
dioxygenase enzyme derived from mouse (GenBank Acc. No. D29987).

SEQ ID NO:9 presents the amino acid sequence of p-hydroxyphenylpyruvate
dioxygenase enzyme derived from rat (GenBank Acc. No. M18405).

SEQ ID NO:10 presents the nucleic acid sequence and deduced amino acid
sequence of the cloned cDNA encoding the Zea mays p-hydroxyphenylpyruvate
dioxygenase enzyme, as contained in plasmid pMPDO.

SEQ ID NO:11 presents the deduced amino acid sequence of the cloned
cDNA encoding the Zea mays p-hydroxyphenylpyruvate dioxygenase enzyme, as
contained in plasmid pMPDO.

SEQ ID NO:12 presents the nucleic acid sequence and the deduced amino
acid sequence of the truncated form of Arabidopsis thaliana
p-hydroxyphenylpyruvate dioxygenase enzyme as contained in pE24CPl.

SEQ ID NO:13 presents the deduced amino acid sequence of the truncated
form of Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase enzyme as
contained in pE24CPl.

SEQ ID NO:14 presents the revised nucleic acid sequence and the deduced
amino acid sequence of the cloned cDNA encoding the full-length Arabidopsis
thaliana p-hydroxyphenylpyruvate dioxygenase enzyme, as contained in plasmid
pGBPPD2.

SEQ ID NO:15 presents the revised amino acid sequence deduced from the
cDNA for the full-length Arabidopsis thaliana p-hydroxyphenylpyruvate
dioxygenase enzyme.

SEQ ID NO:16 presents the nucleic acid sequence determined from a
portion of a cDNA from Vernonia galamenensis, as contained in clone
vsl.pk0015.b2.

DETAILS OF THE INVENTION 
BIOLOGICAL DEPOSITS

The following biological materials have been deposited under the terms of
the Budapest Treaty at American Type Culture Collection (ATCC), 12301
Parklawn Drive, Rockville, MD 20852, and bear the following accession
numbers:

Depositor Identification          Depository
Host Strain          Plasmid      Accession Number    Date of Deposit

E. coli BL21(DE3)    pE24CPl      ATCC 98083          June 25, 1996
N/A                  pGBPPD2      ATCC 97622          June 25, 1996
N/A                  pMPDO        ATCC 209120         June 12, 1997

Definitions 

In the context of this disclosure, a number of terms shall be utilized. As
used herein, the term "nucleic acid" refers to a large molecule which can be
single-stranded or double-stranded, composed of monomers (nucleotides)
containing a sugar, phosphate and either a purine or pyrimidine. A "nucleic acid
fragment" is a portion of a given nucleic acid molecule. As used herein, "DNA"
(deoxyribonucleic acid) is the genetic material, whereas "RNA" (ribonucleic acid)
is involved in the transfer of the information encoded by the DNA into proteins
and polypeptides. A "genome" is the entire body of genetic material contained in
each cell of an organism. The term "nucleotide sequence" refers to a polymer of
DNA or RNA which can be single- or double-stranded, optionally containing
synthetic, non-natural or altered nucleotide bases capable of incorporation into
DNA or RNA polymers.

As used herein, "essentially similar" refers to DNA sequences that may
involve base changes that do not cause a change in the encoded amino acid or
which involve base changes which may alter one or more amino acids, but do not
affect the functional properties of the protein encoded by the DNA sequence. It is
therefore understood that the invention encompasses more than the specific
exemplary sequences. Modifications to the sequence, such as deletions,
insertions, or substitutions in the sequence which produce "silent changes" (i.e.,
those that do not substantially affect the functional properties of the resulting
protein molecule) are also contemplated. For example, alteration(s) in the gene
sequence which reflect the degeneracy of the genetic code, or which result in the
production of a chemically equivalent amino acid at a given site, are
contemplated; thus, a codon for the amino acid alanine, a hydrophobic amino acid,
may be substituted by a codon encoding another less hydrophobic residue, such as
glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
Similarly, changes which result in substitution of one negatively charged residue
for another, such as aspartic acid for glutamic acid, or one positively charged
residue for another, such as lysine for arginine, can also be expected to produce a
biologically equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be
expected to alter the activity of the protein. In some cases, it may in fact be
desirable to make mutants of the sequence in order to study the effect of alteration
on the biological activity of the protein. Each of the proposed modifications is
well within the routine skill in the art, as is determination of retention of
biological activity of the encoded products. Moreover, the skilled artisan
recognizes that "essentially similar" sequences encompassed by this invention are
also defined by their ability to hybridize, under stringent conditions (0.1X SSC,
0.1% SDS, 65°C), with the sequences exemplified herein.
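
The distinction drawn above between base changes that leave the encoded amino
acid unchanged and those that substitute a chemically equivalent residue can be
illustrated with a short sketch. The codon table below is deliberately partial
(only the codons used in the example), and the residue groupings simply restate
the examples given in the text; the function names and test sequences are
hypothetical and do not correspond to any sequence of the invention.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GCC": "A", "GGT": "G",
    "GAT": "D", "GAA": "E", "AAA": "K", "CGT": "R", "TAA": "*",
}

# Residue groups treated as chemically equivalent in the passage above.
EQUIVALENT_GROUPS = [
    {"D", "E"},                 # negatively charged
    {"K", "R"},                 # positively charged
    {"A", "G", "V", "L", "I"},  # alanine and the residues named as its substitutes
]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; raises KeyError for codons not in the table."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_change(original: str, variant: str) -> str:
    """Label a same-length variant as silent, conservative or non-conservative."""
    if len(original) != len(variant):
        raise ValueError("sequences must be the same length")
    before, after = translate(original), translate(variant)
    if before == after:
        return "silent"
    for a, b in zip(before, after):
        if a != b and not any(a in g and b in g for g in EQUIVALENT_GROUPS):
            return "non-conservative"
    return "conservative"
```

In this sketch a GCT to GCC change (alanine to alanine) is reported as silent,
whereas a GAT to GAA change (aspartic acid to glutamic acid) is reported as
conservative.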

"Gene" refers to a nucleic acid fragment that encodes a specific protein,
including regulatory sequences preceding (5' non-coding) and following
(3' non-coding) the coding region. "Native" gene refers to the gene as found in
nature with its own regulatory sequences. "Chimeric" gene refers to a gene
comprising heterogeneous regulatory and coding sequences. "Endogenous" gene
refers to the native gene normally found in its natural location in the genome.
A "foreign" gene refers to a gene not normally found in the host organism but
that is introduced by gene transfer.

"Coding sequence" refers to a DNA sequence that codes for a specific 
protein and excludes the non-coding sequences. 

"Initiation codon" and "termination codon" refer to a unit of three adjacent
nucleotides in a coding sequence that specifies initiation and termination,
respectively, of protein synthesis (mRNA translation). "Open reading frame"
refers to the amino acid sequence encoded between translation initiation and
termination codons of a coding sequence.
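
The definition of an open reading frame given above can be illustrated as
follows. This is a minimal sketch that assumes the standard genetic code, treats
the first ATG encountered as the initiation codon, and reads the coding strand
only; the function name and example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def first_open_reading_frame(dna: str):
    """Return the amino acid sequence between the first ATG and the next
    in-frame termination codon, or None if either is missing.

    Only unambiguous A/C/G/T input is supported; other symbols raise KeyError.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":          # termination codon reached
            return "".join(protein)
        protein.append(amino_acid)
    return None                        # ran off the end without a stop codon
```

For the illustrative input "CCATGGCTGATAAATAGGG", the codons ATG, GCT, GAT and
AAA are read before the termination codon TAG, giving the sequence MADK.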

"RNA transcript" refers to the product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript
is a perfect complementary copy of the DNA sequence, it is referred to as the
primary transcript; alternatively, it may be an RNA sequence derived from
posttranscriptional processing of the primary transcript. "Messenger RNA"
(mRNA) refers to RNA that can be translated into protein by the cell. "cDNA"
refers to a double-stranded DNA, one strand of which is complementary to and
derived from mRNA by reverse transcription. "Sense RNA" refers to RNA
transcript that includes the mRNA.

As used herein, "regulatory sequences" are nucleotide sequences that control
the transcription or expression of a coding sequence. They are located upstream
(5'), within, or downstream (3') of the coding sequence, act in conjunction with
the protein biosynthetic apparatus of the cell and include promoters, translation
leader sequences, transcription termination sequences, and polyadenylation
sequences.

"Promoter" refers to a DNA sequence in a gene, usually upstream (5') to its
coding sequence, which controls the expression of the coding sequence by
providing the recognition for RNA polymerase and other factors required for
proper transcription. A promoter may also contain DNA sequences that are
involved in the binding of protein factors which control the effectiveness of
transcription initiation in response to physiological or developmental conditions.
In the case of eukaryotic organisms, it may also contain enhancer elements.

An "enhancer element" is a DNA sequence which can stimulate promoter
activity. It may be an innate element of the promoter or a heterologous element
inserted to enhance the activity level and tissue-specificity of a promoter.
"Constitutive promoters" refer to those promoters that direct gene
expression in all tissues and at all times. "Organ-specific" or
"development-specific" promoters as referred to herein are those that direct gene
expression almost exclusively in specific organs, such as leaves or seeds, or at
specific development stages in an organ, such as in early or late embryogenesis,
respectively.

The term "operably linked" refers to nucleic acid sequences on a single
nucleic acid molecule which are associated so that the function of one is affected
by the other. For example, a promoter is operably linked with a structural gene
(i.e., a gene encoding p-hydroxyphenylpyruvate dioxygenase, as disclosed herein)
when it is capable of affecting the expression of that structural gene (i.e., that the
structural gene is under the transcriptional control of the promoter).

The term "expression", as used herein, is intended to mean the production of
the protein product encoded by a gene. More particularly, "expression" refers to
the transcription and stable accumulation of the sense RNA (mRNA) derived from
the nucleic acid fragment(s) of the invention that, in conjunction with the protein
apparatus of the cell, results in altered levels of protein product.
"Overexpression" refers to the production of a gene product in transgenic
organisms that exceeds levels of production in normal or non-transformed
organisms. "Altered levels" refers to the production of gene product(s) in
transgenic organisms in amounts or proportions that differ from that of normal or
non-transformed organisms. "Facilitating expression" refers to steps and
conditions for culturing host cells containing the desirable gene to yield an
increased production of the enzyme. For example, addition of a chemical inducer
specific to the particular promoter operably linked to the gene facilitates
expression of the encoded enzyme. This is measured relative to the production
levels of an untreated gene.

The "3' non-coding sequences" refers to the DNA sequence portion of a
gene that contains a polyadenylation signal and any other regulatory signal
capable of affecting mRNA processing or gene expression. The polyadenylation
signal is usually characterized by affecting the addition of polyadenylic acid tracts
to the 3' end of the mRNA precursor.

The "translation leader sequence" refers to that DNA sequence portion of a
gene between the promoter and coding sequence that is transcribed into RNA and
is present in the fully processed mRNA upstream (5') of the translation start
codon. The translation leader sequence may affect processing of the primary
transcript to mRNA, mRNA stability, or translation efficiency.

"Transformation" herein refers to the transfer of a foreign gene into the
genome of a host organism and its genetically stable inheritance. Bacterial
transformation can proceed by any of several methods well known in the art,
including calcium chloride-mediated transformation and electroporation.
Examples of methods of plant transformation include Agrobacterium-mediated
transformation and particle-accelerated or "gene gun" transformation technology
(U.S. Patent No. 4,945,050).

"Host cell" refers to the cell that is transformed with the introduced genetic
material.

"Plasmid vector" refers to a double-stranded, closed circular,
extrachromosomal DNA molecule.

"Tolerant" or "tolerance" refers to a condition whereby a cell or an organism
is able to withstand the effect of application of a compound or composition at a
concentration or application rate that causes a demonstrable effect in or against
cells or organisms that are not tolerant. For example, the growth or survival of a
plant that is tolerant to application of a herbicidal compound or composition will
be less affected than the growth or survival of a plant that is not tolerant to
application of the herbicidal compound or composition.

Cloning of Plant Genes Encoding p-Hydroxyphenylpyruvate Dioxygenase

The p-hydroxyphenylpyruvate dioxygenases from plants are a promising
new class of targets for herbicidal compounds. In order to be able to study
this enzyme in detail, and to have available supplies of enzyme for inhibitor
screening, cDNA clones encoding plant p-hydroxyphenylpyruvate dioxygenases
were identified. These nucleic acid fragments are useful for the production of
their encoded enzymes, for isolation of clones from additional plant sources that
encode other p-hydroxyphenylpyruvate dioxygenase enzymes, and for
understanding the biochemical and structural properties of these enzymes.

Nucleic acid fragments comprising nucleotide sequences that encode
different forms of the enzyme p-hydroxyphenylpyruvate dioxygenase from the
plant Arabidopsis thaliana have now been isolated. Subsequently, these
nucleotide sequences were expressed in E. coli cells and shown to direct the
synthesis of plant p-hydroxyphenylpyruvate dioxygenase enzymes.

An automated search of nucleotide sequences contained in a database
representing an Arabidopsis cDNA library for sequences homologous to other
known, non-plant p-hydroxyphenylpyruvate dioxygenase genes revealed the
plasmid cDNA clone 91B13T7. This cDNA was obtained from the Arabidopsis
Seed Stock Center at Ohio State University. Plasmid DNA suitable for nucleotide
sequence determination was prepared and the nucleotide sequence of the plasmid
insert was determined. The resulting sequence was not interpretable, suggesting
possible contamination of the plasmid sample by an extraneous nucleic acid. This
assumption was confirmed by digesting the plasmid DNA sample with restriction
enzymes and separating the resulting nucleic acid fragments by agarose gel
electrophoresis. This analysis revealed the presence of nucleic acid fragments that
could not be derived from the plasmid carrying the putative
p-hydroxyphenylpyruvate dioxygenase fragment. Furthermore, a search of the
publicly available nucleic acid sequence databases revealed that the Arabidopsis
thaliana sequence reported for cDNA clone 91B13T7 corresponded to a truncated
cDNA (Figure 1). Based on publicly available mammalian cDNA sequence
information for p-hydroxyphenylpyruvate dioxygenase, the minimum length
expected for a cDNA encoding a complete p-hydroxyphenylpyruvate dioxygenase
enzyme is 1 kb (Table 1).



Table 1

Predicted cDNA Length for Sequences
Encoding p-Hydroxyphenylpyruvate Dioxygenase

Organism           Amino Acid Residues    Minimum cDNA (kb)

Human              392                    1.176
Pig                392                    1.176
Pseudomonas sp.    357                    1.071
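
The minimum cDNA lengths in Table 1 follow from three nucleotides per amino
acid residue. The short sketch below reproduces the tabulated values; the
function name is hypothetical, and the calculation, like the table, counts only
the codons for the listed residues and ignores the termination codon and any
untranslated regions.

```python
def minimum_cdna_kb(amino_acid_residues: int) -> float:
    """Minimum coding length in kilobases: three nucleotides per residue."""
    return amino_acid_residues * 3 / 1000


# Residue counts as listed in Table 1.
TABLE_1_RESIDUES = {"Human": 392, "Pig": 392, "Pseudomonas sp.": 357}

for organism, residues in TABLE_1_RESIDUES.items():
    print(f"{organism}: {residues} residues, {minimum_cdna_kb(residues):.3f} kb minimum")
```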



Therefore, based on the expected length of a cDNA capable of encoding a
functional p-hydroxyphenylpyruvate dioxygenase, the Arabidopsis thaliana
sequence obtained from the public database was insufficient to encode a
full-length, active p-hydroxyphenylpyruvate dioxygenase enzyme. Accordingly, a
cDNA with the capacity to encode a full-length Arabidopsis thaliana enzyme was
cloned, as described herein. A 400 bp segment of the insert of plasmid 91B13T7 was
liberated by digestion with restriction enzymes and used to screen a cDNA library
prepared from norflurazon-treated Arabidopsis thaliana seedlings (Scolnik, P. A.,
and Bartley, G. E. (1994) Plant Physiol. 104:1469-1470). Several clones showing
positive hybridization to this probe were sequenced. The initial determination of
the sequence of the longest cDNA clone obtained from this effort is shown in
Figure 2 and in SEQ ID NO:2. During the course of subsequent work with this
clone it became necessary to confirm certain features of the sequence. A corrected
sequence of this cDNA is presented in SEQ ID NO:12.

The sequence reported in Figure 2 indicates that this cDNA has the capacity
to encode a protein of MW 48,841 which, as shown in Figure 3, has a high level
of homology to p-hydroxyphenylpyruvate dioxygenase enzymes from other
eukaryotes.

A cDNA capable of encoding a full-length p-hydroxyphenylpyruvate
dioxygenase has also been obtained from corn. This cDNA, contained in plasmid
pMPDO, was identified in a corn cDNA library using an approximately 900 base
pair portion of the Arabidopsis cDNA as a probe. The predicted amino acid
sequence that is encoded by the corn cDNA is also compared to
p-hydroxyphenylpyruvate dioxygenase enzymes from other eukaryotes in Figure 3.

A cDNA library was prepared from messenger RNA isolated from
developing seeds of Vernonia galamenensis. Random sequencing of the clones
contained in the library identified a probable clone, designated vsl.pk0015.b2, for
the p-hydroxyphenylpyruvate dioxygenase from this plant. The 513 bp expressed
sequence tag (EST) is presented in SEQ ID NO:16.

Expression of the Arabidopsis thaliana cDNA Encoding
p-Hydroxyphenylpyruvate Dioxygenase in E. coli

The nucleic acid fragments of the instant invention encoding plant
p-hydroxyphenylpyruvate dioxygenase enzymes can be operably linked to suitable
regulatory sequences, thereby creating chimeric genes that can be used to direct
expression of the enzyme in transgenic organisms. These transgenic organisms
include, but are not limited to: plants (Plant Molecular Biology; Croy, R. R. D.,
Ed.; Bios Scientific Publishers: 1993); microorganisms, including Escherichia
coli (Gold, L. (1990) Methods in Enzymology 185:11), Bacillus subtilis (Henner,
D. J. (1990) Methods in Enzymology 185:199), yeast (Gellissen, G., et al. (1992)
Antonie Leeuwenhoek 62:79), and fungi, including members of the genus
Aspergillus (Devchand, M. and Gwynne, D. I. (1991) J. Biotechnol. 17:3); and
insect cells containing recombinant baculoviruses (Luckow, V. A. and Summers,
M. D. (1988) Bio/Technology 6:47).

One skilled in the art can isolate the coding sequences from the fragments of
the invention by using or creating sites for restriction endonucleases, as described
in Sambrook, J., et al. ((1989) Molecular Cloning: A Laboratory Manual, 2nd ed.;
Cold Spring Harbor Laboratory Press; hereinafter "Maniatis"). Alternatively,
polymerase chain reaction (PCR) techniques can be employed to isolate and/or
modify the fragments of the invention (Newton, C. R. and Graham, A. (1994)
PCR; Bios Scientific Publishers).

Arabidopsis p-hydroxyphenylpyruvate dioxygenase was expressed in E. coli
under control of a T7 promoter in a strain expressing T7 RNA polymerase
(Studier, F. W., et al. (1990) Methods in Enzymology 185:60). Promoters other
than T7 are commonly used in expression vectors and could be substituted for
protein expression in E. coli. Examples of alternative promoters include, but are
not limited to, trp (Yansura, D. G. and Henner, D. J. (1990) Methods in
Enzymology 185:54), PL (Remaut, E. et al. (1981) Gene 15:81), tac (Amann, E. et
al. (1983) Gene 25:167), trc (Amann, E. et al. (1988) Gene 69:301), and
promoters such as lacUV5, lpp, PR, and hybrid and tandem promoters constructed
to combine specific features to increase strength or regulation capacity (Balbas, P.
and Bolivar, F. (1990) Methods in Enzymology 185:14).

Biochemical Evidence of Enzymatic Function

The enzyme p-hydroxyphenylpyruvate dioxygenase catalyzes the reaction of
p-hydroxyphenylpyruvate with molecular oxygen to give homogentisate and CO2.
The enzyme can be assayed by measuring oxygen utilization (Hager, S. E., et al.
(1957) J. Biol. Chem. 225:935-947), CO2 release or homogentisate production
from radioactively labeled p-hydroxyphenylpyruvate (Lindblad, B. (1971) Clin.
Chim. Acta 34:113-121), loss of the p-hydroxyphenylpyruvate (Lin, E. C. C. et al.
(1958) J. Biol. Chem. 233:668-673), or formation of homogentisate using a
colorimetric assay (Fellman, J. H. et al. (1972) Biochim. Biophys. Acta
284:90-100) or UV detection following HPLC or a similar chromatographic
separation technique. The activity of p-hydroxyphenylpyruvate dioxygenase may
also be measured in a coupled assay in which the initial product, homogentisate, is
oxidized by homogentisate dioxygenase; formation of maleylacetoacetate is
determined by measuring absorbance at 330 nm (Fernandez-Canon, J. M. and
Penalva, M. A. (1997) Anal. Biochem. 245:218-221).

An alternative to any of the kinetic assays for p-hydroxyphenylpyruvate
dioxygenase is an end-point or fixed-time assay. The procedure is based on the
conversion of unconverted substrate, p-hydroxyphenylpyruvate, to its enediol
tautomer by tautomerase in the presence of borate ions and measurement of the
characteristic 308 nm peak of the tautomer (Lin, E. C. C. et al. (1958) J. Biol.
Chem. 233:668-673). The procedure involves the addition of enough
p-hydroxyphenylpyruvate dioxygenase to consume about 80% of the organic
substrate over 1 hour in 200 µL of assay buffer, which in this case is 50 mM
Tris, pH 7.4, 0.10 mM p-hydroxyphenylpyruvic acid, 1.75 mM ascorbate and
1.25 mM EDTA. After 1 hr the reaction is quenched by the addition of 100 µL of
0.8 M borate, pH 7.3, containing 1000 ppb of a p-hydroxyphenylpyruvate
dioxygenase inhibitor and 0.25 µL of 6.1 mg/mL of tautomerase. The absorbance
at 308 nm is read after a 30 min incubation and is stable thereafter for 2 hr.
The advantage of this assay over the kinetic procedure is that the
p-hydroxyphenylpyruvate dioxygenase is not required to oxidize the substrate in
the presence of high concentrations of borate, a condition that might interfere
with the mode of action of inhibitors. Furthermore, the assay produces
essentially a stable binary indication of p-hydroxyphenylpyruvate dioxygenase
inhibition, and is well-suited for applications which require a high throughput
of samples and assays.
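
As an illustration of how end-point readings of this kind might be converted
into an inhibition value, the following minimal Python sketch assumes that the
absorbance at 308 nm is proportional to the substrate left unconverted at the
time of quenching. The function name, the choice of control wells and the
example readings are hypothetical and are not part of the procedure described
above.

```python
def percent_inhibition(a308_test, a308_uninhibited, a308_no_enzyme):
    """Estimate percent inhibition from end-point A308 readings.

    Assumption (not from the source): A308 of the borate-stabilized enediol
    tautomer scales linearly with the substrate remaining after the reaction.
      a308_no_enzyme   -- control with no enzyme (all substrate remains)
      a308_uninhibited -- control with enzyme but no test compound
      a308_test        -- well containing enzyme plus test compound
    """
    window = a308_no_enzyme - a308_uninhibited
    if window <= 0:
        raise ValueError("no-enzyme control must read above the uninhibited control")
    fraction = (a308_test - a308_uninhibited) / window
    # Clamp to 0-100% so small measurement noise does not leave the range.
    return max(0.0, min(1.0, fraction)) * 100.0


# Hypothetical readings for one test well and its two controls.
print(percent_inhibition(a308_test=0.90, a308_uninhibited=0.20, a308_no_enzyme=1.00))
```

Because the reading is taken after quenching, a simple threshold on this value
would give the essentially binary inhibited/not-inhibited call mentioned above.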

The enzyme encoded by the nucleic acid fragments and overexpressed in
E. coli can be extracted in any conventional buffer used for extracting soluble
plant enzymes. Although a large amount of an overexpressed protein is often
insoluble, the amount that is soluble can represent as much as 50% of
the total soluble protein. Soluble overexpressed protein has high
p-hydroxyphenylpyruvate dioxygenase activity and is easily extracted. Likewise,
it may be possible to resolubilize an insoluble overexpressed protein in an
active form under appropriate conditions, since addition of sarkosyl (sodium
N-lauroylsarcosinate) to the extraction buffer appeared to increase the amount
of the overexpressed protein extracted. For optimum activity, a reducing agent
such as ascorbate or reduced glutathione should be present as well as a source
of ferrous ion.

An overexpressed enzyme can be assayed using all the techniques
described above for measuring p-hydroxyphenylpyruvate dioxygenase activity,
while only the techniques using labeled p-hydroxyphenylpyruvate can be used to
measure activity in crude plant extracts. Therefore, the availability of an
overexpressed enzyme greatly facilitates the development of high capacity screens
to identify inhibitors of the enzyme. Potential inhibitors are evaluated for their
capacity to reduce the rate of the reaction of the enzyme, resulting in reduced
oxygen uptake and CO2 release, and lower rates of formation of homogentisate
and loss of p-hydroxyphenylpyruvate. Applicants have demonstrated that at least
one of the instant nucleic acid fragments can be overexpressed in E. coli cells,
resulting in production of a protein that catalyzes the conversion of
p-hydroxyphenylpyruvate to homogentisate with the release of CO2. Furthermore,
it has been shown that this activity is inhibited by commercial herbicides known
to inhibit p-hydroxyphenylpyruvate dioxygenase. Finally, an overexpressed enzyme
can be used in a high capacity assay to identify compounds that inhibit the
enzymatic activity of p-hydroxyphenylpyruvate dioxygenase. Such compounds
may serve as herbicides.

Preparation of Plants Tolerant to Inhibitors of p-Hydroxyphenylpyruvate
Dioxygenase

This invention embodies plants which are resistant or at least tolerant to
herbicides that target the p-hydroxyphenylpyruvate dioxygenase enzyme at levels
which are normally inhibitory to the naturally occurring p-hydroxyphenylpyruvate
dioxygenase enzyme. This altered p-hydroxyphenylpyruvate dioxygenase activity
is conferred by (1) overexpression of the wild-type p-hydroxyphenylpyruvate
dioxygenase enzyme, or (2) expression of a DNA molecule encoding a
herbicide-tolerant enzyme. The said enzyme may be a modified form of a
p-hydroxyphenylpyruvate dioxygenase enzyme that occurs naturally in a eukaryote
or prokaryote, or a modified form of a p-hydroxyphenylpyruvate dioxygenase
enzyme that naturally occurs in a plant, or a herbicide-tolerant enzyme that
naturally occurs in a prokaryote (Duke et al., Herbicide Resistant Crops; Lewis:
Boca Raton; 1994). An effective amount of gene expression to render the cells of
the plant tissue substantially tolerant to the herbicide depends on whether the gene
codes for an unaltered p-hydroxyphenylpyruvate dioxygenase gene or a mutant or
altered form of the gene that is less sensitive to the herbicides. Expression of an
unaltered plant p-hydroxyphenylpyruvate dioxygenase gene in an effective
amount is that amount that provides for a 2- to 10-fold increase in herbicide
tolerance. Plants encompassed by the invention include monocotyledonous and
dicotyledonous plants. Preferred are those plants which would be potential
targets for p-hydroxyphenylpyruvate dioxygenase-inhibiting herbicides,
particularly agronomically important crops such as maize and other cereal crops.

Increased levels of expression of p-hydroxyphenylpyruvate dioxygenase
activity, from two to ten or more times the natively expressed amount, would be
sufficient to overcome growth inhibition caused by the herbicide. Plants
containing such altered p-hydroxyphenylpyruvate dioxygenase enzyme activity
can be obtained by direct selection in plants. This method is known in the art.
See, e.g., U.S. Patent No. 5,162,602, U.S. Patent No. 4,761,373, and references
cited therein.

Overexpression of p-hydroxyphenylpyruvate dioxygenase also can be
accomplished by stably transforming a host plant cell with a chimeric DNA
molecule comprising a promoter capable of driving expression of an associated
coding sequence in a plant cell and operably linked to a homologous or
heterologous coding sequence encoding p-hydroxyphenylpyruvate dioxygenase.
A "homologous" p-hydroxyphenylpyruvate dioxygenase gene is isolated from an
organism taxonomically identical to the target plant cell, whereas a "heterologous"
p-hydroxyphenylpyruvate dioxygenase gene is obtained from an organism
taxonomically distinct from the target plant.

The expression of foreign genes in plants is well-established (De Blaere et
al., (1987) Meth. Enzymol. 143:277-291). Promoters utilized to drive gene
expression in transgenic plants or plant cells (i.e., those capable of driving
expression of the associated coding sequences such as p-hydroxyphenylpyruvate
dioxygenase in plant cells) include those directing the 19S and 35S transcripts in
Cauliflower mosaic virus (Odell et al., (1985) Nature 313:810-812; Hull et al.,
(1987) Virology 86:482-493), small subunit of ribulose 1,5-bisphosphate
carboxylase (Morelli et al., (1985) Nature 315:200-204; Broglie et al., (1984)
Science 224:838-843; Herrera-Estrella et al., (1984) Nature 310:115-120; Coruzzi
et al., (1984) EMBO J. 3:1671-1679; Facciotti et al., (1985) Bio/Technology 3:241)
and chlorophyll a/b binding protein (Lamppa et al., (1986) Nature 316:750-752);
nopaline synthase promoters (Depicker et al. (1982) J. Mol. Appl. Genet.
1:561-573; An et al. (1990) Plant Cell 2:225-233). The chimeric DNA
construct(s) of the invention may contain multiple copies of a promoter or
multiple copies of the p-hydroxyphenylpyruvate dioxygenase coding sequences.
In addition, the construct(s) may include coding sequences for selectable markers
and coding sequences for other peptides such as signal or transit peptides. The
preparation of such constructs is within the ordinary level of skill in the art.
Resistance to inhibitors of the plant carotenoid biosynthesis pathway, which is
also targeted by p-hydroxyphenylpyruvate dioxygenase inhibitors, has been
achieved by expressing a bacterial gene encoding phytoene desaturase driven by
the CaMV promoter (Misawa et al., (1994) Plant J. 6:481-490).

Transit peptides may be fused to the p-hydroxyphenylpyruvate dioxygenase
coding sequence in the chimeric DNA constructs of the invention to direct
transport of the expressed p-hydroxyphenylpyruvate dioxygenase enzyme to the
desired site of action. Examples of transit peptides include the chloroplast transit
peptides such as those described in Von Heijne et al., (1991) Plant Mol. Biol. Rep.
9:104-126; Mazur et al., (1987) Plant Physiol. 85:1110; Vorst et al., (1988) Gene
65:59; and mitochondrial transit peptides such as those described in Boutry et al.,
(1987) Nature 328:340-342.

It is envisioned that the introduction of enhancers or enhancer-like elements
into other promoter constructs will also provide increased levels of primary
transcription to accomplish the invention. These would include viral enhancers
such as that found in the 35S promoter (Odell et al., (1988) Plant Mol. Biol.
10:263-272), enhancers from the opine genes (Fromm et al., (1989) Plant Cell
1:977-984), or enhancers from any other source that result in increased
transcription when placed into a promoter operably linked to the nucleic acid
fragment of the invention.

Introns isolated from the maize Adh-1 and Bz-1 genes (Callis et al., (1987)
Genes Dev. 1:1183-1200), and intron 1 and exon 1 of the maize Shrunken-1 (sh-1)
gene (Maas et al., (1991) Plant Mol. Biol. 16:199-207) may also be of use to
increase expression of introduced genes. Results with the first intron of the maize
alcohol dehydrogenase (Adh-1) gene indicate that when this DNA element is
placed within the transcriptional unit of a heterologous gene, mRNA levels can be
increased by 6.7-fold over normal levels. Similar levels of intron enhancement
have been observed using intron 3 of a maize actin gene (Luehrsen, K. R. and
Walbot, V., (1991) Mol. Gen. Genet. 225:81-93). Enhancement of gene
expression by Adh1 intron 6 (Oard et al., (1989) Plant Cell Rep. 8:156-160) has
also been noted. Exon 1 and intron 1 of the maize sh-1 gene have been shown to
individually increase expression of reporter genes in maize suspension cultures by
10- and 100-fold, respectively. When used in combination, these elements have
been shown to produce up to 1000-fold stimulation of reporter gene expression
(Maas et al., (1991) Plant Mol. Biol. 16:199-207).

Any 3' non-coding region capable of providing a polyadenylation signal and
other regulatory sequences that may be required for proper expression can be used
to accomplish the invention. This would include the 3' end from any storage
protein such as the 3' end of the 10kd, 15kd, 27kd and alpha zein genes, the 3' end
of the bean phaseolin gene, the 3' end of the soybean β-conglycinin gene, the 3'
end from viral genes such as the 3' end of the 35S or the 19S cauliflower mosaic
virus transcripts, the 3' end from the opine synthesis genes, the 3' ends of ribulose
1,5-bisphosphate carboxylase or chlorophyll a/b binding protein, or 3' end
sequences from any source such that the sequence employed provides the
necessary regulatory information within its nucleic acid sequence to result in the
proper expression of the promoter/coding region combination to which it is
operably linked. There are numerous examples in the art that teach the usefulness
of different 3' non-coding regions (for example, see Ingelbrecht et al., (1989)
Plant Cell 1:671-680).

Various methods of introducing a DNA sequence (i.e., of transforming) into
eukaryotic cells of higher plants are available to those skilled in the art (see EPO
publications 0 295 959 A2 and 0 138 341 A1). Such methods include
high-velocity ballistic bombardment with metal particles coated with the nucleic
acid constructs (see Klein et al., (1987) Nature (London) 327:70-73, and see U.S.
Patent No. 4,945,050), as well as those based on transformation vectors based on
the Ti and Ri plasmids of Agrobacterium spp., particularly the binary type of these
vectors. Ti-derived vectors transform a wide variety of higher plants, including
monocotyledonous and dicotyledonous plants, such as soybean, cotton and rape
seed (Facciotti et al., (1985) Bio/Technology 3:241; Byrne et al., (1987) Plant
Cell, Tissue and Organ Culture 8:3; Sukhapinda et al., (1987) Plant Mol. Biol.
8:209-216; Lorz et al., (1985) Mol. Gen. Genet. 199:178-182; Potrykus et al.,
(1985) Mol. Gen. Genet. 199:183-188).

Other transformation methods are available to those skilled in the art, such
as direct uptake of foreign DNA constructs (see EPO publication 0 295 959 A2),
and techniques of electroporation (see Fromm et al., (1986) Nature (London)
319:791-793). Once transformed, the cells can be regenerated by those skilled in
the art. Also relevant are several recently described methods of introducing
nucleic acid fragments into commercially important crops, such as rapeseed (see
De Block et al., (1989) Plant Physiol. 91:694-701), sunflower (Everett et al.,
(1987) Bio/Technology 5:1201-1204), soybean (McCabe et al., (1988)
Bio/Technology 6:923-926; Hinchee et al., (1988) Bio/Technology 6:915-922;
Chee et al., (1989) Plant Physiol. 91:1212-1218; Christou et al., (1989) Proc.
Natl. Acad. Sci. USA 86:7500-7504; EPO Publication 0 301 749 A2), and corn
(Gordon-Kamm et al., (1990) Plant Cell 2:603-618; and Fromm et al., (1990)
Bio/Technology 8:833-839).

Altered p-hydroxyphenylpyruvate dioxygenase enzyme activity may also be
achieved through the generation or identification of modified forms of the isolated
eukaryotic p-hydroxyphenylpyruvate dioxygenase coding sequence having at least
one amino acid substitution, addition or deletion which encodes an altered
p-hydroxyphenylpyruvate dioxygenase enzyme resistant to a herbicide that
inhibits the unaltered, naturally occurring form. Genes encoding such enzymes
can be obtained by numerous strategies known in the art. A first general strategy
involves direct or indirect mutagenesis procedures on microbes (e.g., E. coli,
S. cerevisiae (Miller, (1972) Experiments in Molecular Genetics, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY; Davis et al., (1980) Advanced
Bacterial Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY;
Sherman et al., (1983) Methods in Yeast Genetics, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY; and U.S. Patent No. 4,975,374) and
cyanobacteria (Bryant, The Molecular Biology of Cyanobacteria; Kluwer
Academic Publishers: Boston, 1995)). A second method of obtaining mutant
herbicide-resistant alleles of the eukaryotic p-hydroxyphenylpyruvate dioxygenase
enzyme involves direct selection in plants. For example, the effect of inhibitors
on the growth of plants such as Arabidopsis, soybean, or maize may be
determined by plating seeds sterilized by art-recognized methods on plates on a
simple minimal salts medium containing increasing concentrations of the
inhibitor. The lowest dose at which significant growth inhibition can be
reproducibly detected is used for subsequent experiments. Mutagenesis of plant
material may be utilized to increase the frequency at which resistant alleles occur
in the selected population. Mutagenized seed material can be derived from a
variety of sources, including chemical or physical mutagenesis of seeds, or
chemical or physical mutagenesis of pollen (Neuffer, in Maize for Biological
Research, Sheridan, ed., Univ. Press, Grand Forks, ND, pp. 61-64 (1982)), which
is then used to fertilize plants and the resulting M1 mutant seeds collected.
Typically, for Arabidopsis, M2 seeds (i.e., progeny seeds of plants grown from
seeds mutagenized with chemicals, such as ethyl methane sulfonate, or with
physical agents, such as gamma rays or fast neutrons) are plated at densities of up
to 10,000 seeds/plate (10 cm diameter) on minimal salts medium containing an
appropriate concentration of inhibitor. Seedlings that continue to grow and
remain green 7-21 days after plating are transplanted to soil and grown to maturity
and seed set. Progeny of these seeds are tested for resistance to the herbicide. If
the resistance trait is dominant, plants whose seed segregate 3:1
(resistant:sensitive) are presumed to have been heterozygous for the resistance at
the M2 generation. Plants that give rise to all resistant seed are presumed to have
been homozygous for the resistance at the M2 generation. Such mutagenesis on
intact seeds and screening of their M2 progeny seed can also be carried out on
other species, for instance soybean (see, e.g., U.S. Patent No. 5,084,082). Mutant
seeds to be screened for herbicide tolerance can also be obtained as a result of
fertilization with pollen mutagenized by chemical or physical means.
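
The 3:1 (resistant:sensitive) segregation expected for a dominant trait in the
progeny of a heterozygous plant is commonly checked with a chi-square
goodness-of-fit test. The following Python sketch is illustrative only: the
function names and the seedling counts are hypothetical, and the 5%
significance threshold is an assumption rather than a criterion stated in this
specification.

```python
# Critical chi-square value for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_3_to_1(resistant, sensitive):
    """True when the counts do not differ significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_5_PERCENT_DF1


# Hypothetical progeny counts from two M2 plants.
print(consistent_with_3_to_1(152, 48))  # close to 3:1, heterozygous parent
print(consistent_with_3_to_1(200, 0))   # all resistant, homozygous parent
```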

EXAMPLE 1
Cloning of a cDNA for Arabidopsis thaliana
p-Hydroxyphenylpyruvate Dioxygenase
The plasmid containing the Arabidopsis thaliana 91B13T7 expressed
sequence tag (Newman et al., (1994) Plant Physiol. 106:1241-1255) was digested
with the restriction enzymes BamHI and EcoRI, and the resulting 400 bp fragment
was used to screen a lambda phage cDNA library of Arabidopsis thaliana
seedlings (Scolnik, P. A. and Bartley, G. E. (1994) Plant Physiol. 104:1469-1470)
according to the following protocol.

E. coli KW251 cells were grown overnight in Luria Broth ("LB") containing
0.2% maltose and 10 mM MgSO4. Cells were pelleted by centrifugation and
resuspended in 10 mM MgSO4 to an OD600 of [illegible]. Aliquots (0.8 mL) were
mixed with 0.1 mL of diluted phage samples and 7 mL of top agarose (0.7%
agarose in LB containing 10 mM MgSO4) at 45 °C, and plated onto 150 mm Petri
dishes containing LB agar. Phage plaques became visible in 5-7 h, at which point
the plates were placed at 4 °C.

Phage plaques were transferred to nitrocellulose filters according to standard
techniques, and the filters were hybridized to 32P-radiolabeled probe prepared
according to the method of Feinberg and Vogelstein ((1983) Anal. Biochem.
132:6-13), using the hybridization conditions of Berlyn et al. ((1989) Proc. Natl.
Acad. Sci. 86:4604-4608). After exposure to X-ray film for 48 h, 12 positive
plaques were eluted, plated, and hybridized under the same conditions. A total of
9 plaques that retained positive signals in this second round of hybridization were
subjected to in vivo excision using the ExAssist/SOLR™ system according to the
manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). DNA from
the plasmids resulting from in vivo excision of positive plaques was prepared for
DNA sequencing using the Wizard Plus™ kit (Promega, Madison, WI). Eight of
the clones that were sequenced showed strong conservation with available
p-hydroxyphenylpyruvate dioxygenase sequences, whereas the remaining clone
did not correspond to a p-hydroxyphenylpyruvate dioxygenase. Alignment with
known p-hydroxyphenylpyruvate dioxygenase sequences also revealed that two of
the clones correspond to 0.3 kbp fragments from the 3' end of the transcript, and
another two to 1.2 kbp fragments from the 5' end of the transcript. One clone of
each was used to assemble a 1.5 kbp cDNA by ligating at the internal NheI
restriction site (Figure 1). The initial determination of the DNA sequence (SEQ
ID NO:2) of the resulting cDNA clone is shown in Figure 2. Subsequent work
with this DNA fragment required confirmation of some of the features of its
sequence. Approximately ten nucleotide residues were found to have been listed
in error. Thus a corrected sequence for this DNA fragment is listed in SEQ ID
NO:14 and the deduced amino acid sequence is set forth in SEQ ID NO:15. The
revised sequences form the basis for analyses and comparisons reported herein.

EXAMPLE 2
Overexpression of the Arabidopsis cDNA in E. coli
The deduced amino acid sequence for Arabidopsis p-hydroxyphenylpyruvate
dioxygenase was aligned with the amino acid sequences of
p-hydroxyphenylpyruvate dioxygenase from mouse, pig, and Streptomyces
avermitilis using the Pileup program of GCG (Program Manual for the Wisconsin
Package, Version 8, September 1994, Genetics Computer Group, 575 Science
Drive, Madison, WI, USA 53711). This analysis suggested an additional
29 amino acid extension at the amino terminus of the Arabidopsis sequence
(positions 1-29, Figure 3 and SEQ ID NO:3). This amino-terminal extension was
assumed to be a chloroplast transit peptide which would be absent from the
mature enzyme. Therefore, removal of the chloroplast transit peptide coding
sequence coincided with transfer of the p-hydroxyphenylpyruvate dioxygenase
coding sequence from the cloning vector into the expression vector.

The Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA was moved
from the pBluescript SK- cloning vector (Stratagene, La Jolla, CA) to the
pET24c(+) expression vector (Novagen, Madison, WI) through the intermediate
cloning vector pT7BlueR (Novagen). The plasmid pGBPPD2 consists of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA and the pBluescript
SK- cloning vector (Stratagene). The plasmid pE24CP1 consists of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA, without the putative
chloroplast transit peptide DNA sequence, and the pET24c(+) expression vector
(Novagen).

The plasmids pGBPPD2 and pT7BlueR (5 µg each) were individually
digested with 20 units of Xba I (New England Biolabs, NEB, Beverly, MA) and
20 units of Hind III (Gibco BRL, Gaithersburg, MD) in NEB restriction enzyme
buffer 2 supplemented with 100 µg/mL bovine serum albumin at 37 °C for 1.75 h.
Digesting pGBPPD2 with the restriction enzymes Xba I and Hind III releases the
5' and 3' ends, respectively, of the p-hydroxyphenylpyruvate dioxygenase cDNA
from the pBluescript SK- polylinker. Products of the digestion were
electrophoretically separated in a 1 percent agarose gel using TRIS/acetate/EDTA
(TAE) buffer and visualized with ethidium bromide staining (Maniatis). Digestion
of pGBPPD2 with the two restriction endonucleases resulted in a 2922 bp vector
band and 1499 bp p-hydroxyphenylpyruvate dioxygenase cDNA band. Only a
2863 bp band was apparent after digesting pT7BlueR with the two enzymes,
although a 24 bp fragment would also result. The 1499 bp
p-hydroxyphenylpyruvate dioxygenase band and the 2863 bp pT7BlueR band were
cut out of the gel and the associated DNA purified from the agarose using a
QIAquick Gel Extraction Kit (Qiagen, Chatsworth, CA) according to the
manufacturer's instructions. The purified DNA samples were precipitated by the
addition of sodium acetate (pH 5.2) to 0.3 M, 10 µg tRNA (added as carrier), two
volumes of -20 °C ethanol and incubation at -20 °C overnight. Nucleic acid
pellets were collected by centrifugation, washed with 70% ethanol and air dried.
Both pellets were solubilized in 10 µL of TRIS/EDTA (TE) buffer, pH 8
(Maniatis), and then 1 µL of each sample loaded onto a 1% agarose, TAE gel in
separate wells next to a well containing 4 µL of Mass Ladder (Gibco BRL). All
samples were adjusted to 10 µL with water before loading. DNA was quantified by
comparing band intensities of each sample with Mass Ladder band intensities
following ethidium bromide staining and UV illumination.

Approximately 300 ng of p-hydroxyphenylpyruvate dioxygenase insert was
mixed with 300 ng of double digested pT7BlueR vector in a total volume of 7 µL
and then heated to 45 °C for 5 min followed by cooling on ice. T4 DNA ligase
buffer (Gibco BRL) and 1 unit of T4 DNA ligase (Gibco BRL) were added to the
cooled DNA for a total volume of 10 µL. The ligation mix was incubated at room
temperature for 4 h and then transformed into MAX Efficiency DH5α Competent
Cells (Gibco BRL) of E. coli according to standard procedures (Maniatis).
Transformed bacteria were spread onto LB agar plates supplemented with
100 µg/mL carbenicillin and incubated overnight at 37 °C. Seventeen bacterial
colonies were selected for subsequent analysis. A portion of each colony was
inoculated into a separate 17 x 100 mm polypropylene culture tube (Falcon,
Lincoln Park, NJ) containing 2 mL of liquid LB media and 200 µg/mL
carbenicillin. Liquid bacteria cultures were incubated overnight at 37 °C with
shaking (250 rpm). Plasmid DNA was then isolated using a QIAprep Spin
Plasmid Miniprep Kit (Qiagen) according to the manufacturer's instructions. A
portion (5 µL out of 50 total) of each plasmid preparation was digested with
10 units each of Hind III and EcoR V (Gibco BRL) in a total volume of 15 µL
with React 2 buffer (Gibco BRL) for one h. (Note: The EcoR V site in the
pBluescript polylinker was destroyed during the preparation of pGBPPD2 so only
the EcoR V site in the pT7BlueR polylinker would be accessible to the restriction
nuclease.) Samples were separated electrophoretically in 1% agarose and
tris/borate/EDTA (TBE) buffer (Maniatis). Bands were visualized with ethidium
bromide staining; 7 out of 17 samples which contained 2 bands (2837 and
1525 bp) contained the p-hydroxyphenylpyruvate dioxygenase insert and were
designated pT7BlueR+PD01 (see Figure 4).
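
The insert-to-vector molar ratio implied by the ligation described above can be
estimated from the masses and fragment lengths given in the text. The Python
sketch below is illustrative only; the function name is hypothetical, and it
assumes that double-stranded DNA has the same average mass per base pair in both
fragments, so that the mass per base pair cancels in the ratio.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for double-stranded DNA fragments.

    Moles are proportional to mass divided by length, so the (assumed equal)
    average mass per base pair cancels and no molecular weight is needed.
    """
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("masses and lengths must be positive")
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# 300 ng of the 1499 bp insert with 300 ng of the 2863 bp pT7BlueR vector.
ratio = insert_to_vector_molar_ratio(300, 1499, 300, 2863)
print(round(ratio, 2))  # 1.91
```

With equal masses, the shorter insert is present in roughly a two-fold molar
excess over the vector.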

In order to remove the putative chloroplast transit sequence, the remaining
45 µL of each prep of pT7BlueR+PD01 were combined into a single sample and
the DNA content determined spectrophotometrically at A260 (Maniatis). A
portion (5 µg) of pT7BlueR+PD01 was digested with 16 units of Eco47 III (MBI
Fermentas) in a total volume of 100 µL containing buffer O (MBI Fermentas) at
37 °C for 2 h. The digested plasmid DNA was then precipitated with sodium
acetate and ethanol as above and the resulting dried nucleic acid pellet was
dissolved in 60 µL of React 2 (Gibco BRL) containing 20 units of Nde I (Gibco
BRL) and incubated 2 h at 37 °C. The double digested sample was then loaded
onto a 1% agarose gel in TAE and the large 4166 bp Nde I-Eco47 III fragment
separated from the 196 bp fragment electrophoretically. The large fragment was
cut out of the gel, purified from agarose and precipitated as above.

An oligonucleotide mix was prepared consisting of 100 pmoles each of
oligos CAM32 and CAM33 (SEQ ID NOS:4 and 5, respectively) in a combined
volume of 9.9 µL. The two oligos complement each other to form a 3' blunt end
corresponding to the 5' half of an Eco47 III restriction site and also form a 5'
staggered end which corresponds to the 3' half of an Nde I restriction site.

CAM32: (SEQ ID NO:4)
5'-TATGTCCAAGTTCGTAAGAAAGAATCCAAAGTCTGATAAATTCAAGGTTAAGC-3'

CAM33: (SEQ ID NO:5)
5'-GCTTAACCTTGAATTTATCAGACTTTGGATTCTTTCTTACGAACTTGGACA-3'
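
The stated complementarity of the two oligos can be checked mechanically. The
Python sketch below is a minimal illustration: the helper name is hypothetical,
and the CAM32 string assumes that the sequence is the exact complement of CAM33
plus a two-base 5' extension, as the description of the annealed ends implies.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


# Sequences written 5' to 3'; CAM32 is assumed to pair perfectly with CAM33.
CAM32 = "TATGTCCAAGTTCGTAAGAAAGAATCCAAAGTCTGATAAATTCAAGGTTAAGC"
CAM33 = "GCTTAACCTTGAATTTATCAGACTTTGGATTCTTTCTTACGAACTTGGACA"

overhang_length = len(CAM32) - len(CAM33)
overhang = CAM32[:overhang_length]   # unpaired 5' bases of CAM32
duplex = CAM32[overhang_length:]     # portion that pairs with CAM33

print(overhang)                               # staggered end for the Nde I site
print(duplex == reverse_complement(CAM33))    # True if the oligos anneal fully
print(duplex[-3:])                            # blunt end, 5' half of Eco47 III
```

The 5'-TA overhang matches the end left by Nde I (CA^TATG), and the terminal AGC
matches the 5' half of the Eco47 III site (AGC^GCT).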

The oligo mix was heated to 90 °C for 1.5 min and then allowed to cool to
room temperature over 20 min. The dried nucleic acid pellet resulting from
purification of the 4166 bp Nde I-Eco47 III fragment was solubilized in 7 µL of
the cooled oligo mix and subsequently heated to 45 °C for 5 min followed by
cooling on ice. Ligation of the oligos with the Nde I-Eco47 III fragment followed
by transformation into DH5α was performed as above. Transformed bacterial
cells were spread onto LB/carbenicillin plates and incubated at 37 °C overnight.
Seventeen colonies were selected and processed to isolate plasmid DNA as above.
A portion (5 out of 50 µL) of each plasmid was double digested with 10 units each
of Nde I and Hind III and the fragments separated electrophoretically on a 1%
agarose gel in TBE. A two band pattern corresponding to insert (1373 or 1518 bp)
and vector (2844 bp) was detected. An additional double digest with 10 units
each of Xba I and Hind III was performed on another 5 µL aliquot of plasmids.
When digested with Nde I and Hind III, none of the plasmids which contained the
smaller insert size contained a Xba I site. The Xba I site would be eliminated if
the two oligos replaced the 196 bp fragment originally present in pT7BlueR+PD01.
The 7 plasmid samples with the modified p-hydroxyphenylpyruvate dioxygenase
insert were combined and designated pT7BlueR+PD02.
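
The band sizes reported for the intermediate plasmids can be cross-checked by
simple addition, since the fragments of any complete digest must sum to the
plasmid length. The Python sketch below is illustrative; the dictionary labels
are hypothetical, the unmodified Nde I/Hind III pattern is attributed to
pT7BlueR+PD01, and the 51 bp duplex length is an assumption taken from the
length of oligo CAM33.

```python
# Fragment sizes (bp) reported in the text for each double digest.
digests = {
    "PD01: Hind III + EcoR V": [2837, 1525],
    "PD01: Nde I + Eco47 III": [4166, 196],
    "PD01: Nde I + Hind III": [2844, 1518],
    "PD02: Nde I + Hind III": [2844, 1373],
}

# Total plasmid length implied by each digest.
totals = {name: sum(fragments) for name, fragments in digests.items()}
for name, total in totals.items():
    print(name, total)

# Replacing the 196 bp Nde I-Eco47 III fragment with the oligo duplex
# (assumed 51 bp) should shorten the plasmid by the same amount as the
# observed change in insert size.
OLIGO_DUPLEX_BP = 51
expected_shortening = 196 - OLIGO_DUPLEX_BP
observed_shortening = 1518 - 1373
print(expected_shortening == observed_shortening)
```

All three digests of the unmodified plasmid imply the same length, and the
145 bp reduction in insert size agrees with the exchange of the 196 bp fragment
for the annealed oligos.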

The pT7BlueR+PD02 plasmid DNA was quantified spectrophotometrically
(above) and then 5 µg was digested with 20 units each of Hind III and Nde I in
62 µL of React 2 for 2 h at 37 °C. The digested sample was subsequently loaded
onto a 1% agarose gel in TAE and separated electrophoretically. The 1373 bp
fragment was isolated and precipitated as above. The plasmid pET24c(+) (5 µg)
was double digested with 20 units each of both Nde I and Hind III in React 2 at
37 °C for 2 h and the 5245 bp fragment then gel purified on a 1% agarose gel in
TAE and subsequently separated from agarose and precipitated as above. The
dried pET24c(+) pellet was solubilized in 10 µL TE and then 8 µL was adjusted to
a 20 µL total volume with water, dephosphorylation buffer (Gibco BRL) and
1 unit of calf intestinal alkaline phosphatase (Gibco BRL). The sample was
incubated at 37 °C for 30 min and then gel purified, separated from agarose, and
precipitated as above. The dried, dephosphorylated, pET24c(+) vector pellet and
modified p-hydroxyphenylpyruvate dioxygenase insert pellet were each solubilized
in 10 µL TE and then 1 µL of each was run on a 1% agarose TBE gel with 4 µL of
mass ladder to quantify DNA as above. One hundred nanograms of modified
p-hydroxyphenylpyruvate dioxygenase insert was mixed with 120 ng of
dephosphorylated pET24c(+) vector in a total of 7 µL volume. The mix was
heated to 45 °C for 5 min and then cooled on ice. The mix was then supplemented
with T4 DNA ligase buffer and 1 unit of T4 DNA ligase in a total volume of
10 µL and the mix allowed to incubate at room temperature for 4 h. The ligation
mix was subsequently transformed into DH5α, spread on LB agar supplemented
with 30 µg/mL kanamycin, and incubated overnight at 37 °C. Plasmid
preparations were performed on 11 colonies as above. Plasmids were double
digested with Nde I and Hind III and fragments separated electrophoretically. All
plasmids had the expected 1373 bp and 5245 bp fragments. One bacteria colony
was selected and used to inoculate 100 mL of liquid LB supplemented with
30 µg/mL kanamycin which was subsequently incubated at 37 °C overnight with
shaking. Plasmid DNA was isolated from the resulting bacteria culture using a
Qiagen Plasmid Midi Kit according to the manufacturer's instructions. A portion
of the plasmid DNA (pE24CP1) was sequenced with the Sequenase Version 2.0
DNA Sequencing Kit (United States Biochemical, Cleveland, OH) using a
biotinylated sequencing primer to the T7 promoter (United States Biochemical)
according to the manufacturer's instructions for non-radioactive manual
sequencing. DNA was transferred from the sequencing gel to Hybond-N+ nylon
transfer membrane (Amersham, Arlington Heights, IL) by capillary action.
Transfer and all subsequent steps in chemiluminescent detection of DNA
fragments were performed with a SEQ-Light Chemiluminescent Sequencing
System kit (Tropix, Bedford, MA) according to the manufacturer's instructions.
DNA sequencing verified that the plasmid contained the expected 5' sequence for
the modified p-hydroxyphenylpyruvate dioxygenase insert where nucleotides 1-95
(Figure 2) were replaced with an ATG translational start site. This is equivalent
to amino acids 2-29 (Figure 3) being eliminated from the N-terminus of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase amino acid sequence.

The plasmid pE24CP1 was transformed into competent cells of BL21(DE3)
E. coli (Novagen), as above. Transformed cells were spread on LB/kanamycin
plates and incubated overnight at 37 °C. Seven colonies were selected for plasmid
preparations as above and plasmid DNA was double digested with Nde I and
Hind III to verify that all plasmids had the expected electrophoretic banding
pattern. One colony was selected and streaked for isolation on LB/kanamycin
plates. A well isolated colony was used to inoculate liquid LB supplemented with
30 µg/mL kanamycin and the culture was incubated at 37 °C with shaking
(250 rpm) until it reached an A600 of 0.6 absorbance units. An 8% glycerol
freezer stock was prepared according to the Novagen protocol and stored at
-80 °C. All subsequent expression studies were done with freshly grown bacterial
cells that were isolated from LB/kanamycin plates streaked from the glycerol
freezer stock.

BL21(DE3) E. coli cells containing either pE24CP1 or pET24c(+) (negative
control) were streaked out onto LB/kanamycin plates from a glycerol freezer stock
(above) and incubated overnight at 37 °C. One isolated colony was selected for
inoculation of 2 mL of LB containing 30 µg/mL kanamycin in a 17 x 100 mm
Falcon tube, and the culture was incubated at 37 °C with shaking (250 rpm)
overnight. The overnight cultures were then used to inoculate 100 mL of fresh LB
containing 30 µg/mL kanamycin. The new cultures were incubated at 37 °C with
shaking until the A600 reached between 0.4 and 0.6 absorbance units. One half of
each of the pE24CP1 and pET24c(+) cultures was placed in a new culture flask
and IPTG (isopropylthio-β-D-galactoside; Gibco BRL) was added to the new
flasks to give a final concentration of 1 mM. The flasks were incubated an
additional 3 h at 37 °C with shaking, and then the cells were harvested.

The harvested cells were centrifuged and the resulting cell pellet extracted
by sonication (3 x 10 sec bursts) in 2 mL extraction buffer (50 mM (20 mM in the
first experiment; Table 2) potassium phosphate buffer, pH 7.2, containing 0.14 M
KCl, 0.32 mM reduced glutathione, 1% polyvinylpolypyrrolidone, and 0.1%
Triton X-100 (0.01% lysozyme was included in the first experiment only)). The
lysate represents the crude extracted enzyme after centrifugation at 17000 g for
10 min. In the first experiment (Table 2) a 20 to 60% ammonium sulfate
precipitated enzyme fraction was also assayed. Solid ammonium sulfate was
slowly added with stirring to 2 mL of the lysate to bring the concentration to 20%
(w/v). After incubation on ice for approximately 15 min, the solution was
centrifuged at 17000 g for 10 min. The supernatant liquid was harvested and solid
ammonium sulfate was added to increase the concentration to 60% (w/v). After
centrifugation, the resulting pellet was resuspended in 1 mL of the extraction
buffer.
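
The amounts of solid ammonium sulfate implied by the 20% and 60% (w/v) cuts can be estimated with a short calculation. The following is only a sketch: it assumes that % (w/v) means grams per 100 mL, that the volume stays at 2 mL after the first centrifugation, and that the volume change on dissolving the salt can be ignored. The function name is hypothetical and is not taken from the text.

```python
def ammonium_sulfate_grams(volume_ml, from_pct_wv, to_pct_wv):
    """Grams of solid salt needed to raise a solution from one % (w/v) to another.

    Assumption: % (w/v) is grams per 100 mL and the volume does not change
    when the salt dissolves.
    """
    if to_pct_wv < from_pct_wv:
        raise ValueError("target concentration must not be below the starting one")
    return volume_ml * (to_pct_wv - from_pct_wv) / 100.0


# First cut: 2 mL of lysate brought from 0% to 20% (w/v).
first_cut_g = ammonium_sulfate_grams(2.0, 0.0, 20.0)
# Second cut: the supernatant (assumed to still be 2 mL) raised from 20% to 60% (w/v).
second_cut_g = ammonium_sulfate_grams(2.0, 20.0, 60.0)
print(f"first cut: {first_cut_g:.2f} g, second cut: {second_cut_g:.2f} g")
```

Under these assumptions the second addition is twice the first, because the concentration step (40 percentage points) is twice as large.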

A portion of the insoluble protein resulting from expression of Arabidopsis
p-hydroxyphenylpyruvate dioxygenase in bacteria was utilized for N-terminal
sequence analysis. The protein (approximately 180 µg) was suspended in 60 µL
of extraction buffer and then diluted with 5 volumes of sample buffer (62.5 mM
Tris, pH 6.8, 6 M urea, 160 mM dithiothreitol, 0.01% bromophenol blue)
followed by intermittent vortexing for one hour at room temperature. A 1.5 mm
thick, 12% polyacrylamide resolving gel was prepared for a Mini-PROTEAN II dual
slab cell (Bio-Rad, Hercules, CA) using the manufacturer's instructions. The
polyacrylamide was allowed to polymerize for 3 h and then a stacking gel was
prepared using a preparative comb. The running buffer was prepared according to
the manufacturer's instructions with the addition of 0.1 mM sodium thioglycolate.
The solubilized protein sample was electrophoretically separated using the
manufacturer's instructions. When the bromophenol blue dye front reached the
bottom of the gel, the gel was removed and equilibrated for 5 min in blotting
buffer (10 mM CAPS, pH 11, 10% methanol, balance water). The gel was then
placed in a Mini Trans-Blot Electrophoretic Transfer Cell (Bio-Rad), according to
the manufacturer's instructions, with a ProBlott PVDF membrane (Applied
Biosystems, Foster City, CA) treated according to the manufacturer's instructions.
Electroblotting was done in the presence of blotting buffer at 50 volts for 45 min
in an ice bath. The membrane was then rinsed in water and stained with
Coomassie Blue as described in the ProBlott protocol. The major protein band
was excised from the membrane and subjected to N-terminal amino acid
sequencing on a Beckman (Fullerton, CA) LF3000 protein sequencer. The first
11 cycles identified S-K-F-V-R-K-N-P-K-S-D (see SEQ ID NO:3, amino acids
30-40), respectively. This is the expected N-terminus of the modified Arabidopsis
p-hydroxyphenylpyruvate dioxygenase minus the initial methionine (amino acids
30-40, Figure 3).

EXAMPLE 3

p-Hydroxyphenylpyruvate Dioxygenase Enzymatic Activity
of the Plant Protein Expressed in E. coli
Cell cultures with different plasmid constructs were extracted as described
above and assayed by measuring the formation of 14CO2 from
[1-14C]-p-hydroxyphenylpyruvate or 14CO2 and 14C-homogentisate from
[U-14C]-p-hydroxyphenylpyruvate (Lindblad, B., (1971) Clin. Chim. Acta
34:113-121; and Lindstedt, S. and Odelhog, B., (1987) Methods in Enzymology
142:143-148). The labeled substrate was prepared from [1-14C]-L-tyrosine
(55 mCi/mmol; American Radiolabeled Chemicals, Inc., St. Louis, MO) or
[U-14C]-L-tyrosine (498 mCi/mmol; DuPont NEN, Boston, MA). A 50-100 µL
aliquot (5-10 µCi) of the labeled tyrosine stock solution was transferred to a
4 mL glass vial and blown to dryness in a stream of nitrogen at 45 °C. To the vial
was added 175 µL of 0.1 M phosphate buffer, pH 6.5, 5 µL catalase (28,700 units
of C-100, Sigma Chemical Co., St. Louis, MO), and 20 µL L-amino acid oxidase
(Sigma A-9253, 6.5 units/mL). The vial was then placed on a shaker water bath
set at 30 °C, 60 cycles/min, for 0.5 to 1 h. The reaction mix was then passed
through a small column containing 400 µL Dowex AG 50W X8 cation exchange
resin. The column was then washed with 1.5 mL of water and the eluant
containing the labeled p-hydroxyphenylpyruvate was collected. The labeled
substrate was either used immediately or stored at -80 °C and used within a week
after preparation.

The assay was performed in 14 mL culture tubes capped with serum
stoppers through which a polypropylene well containing 200 µL of 1 N KOH was
suspended. The reaction mixture contained 5,740 units of catalase, 100 µL of a
freshly prepared 1:1 (v:v) mixture of 150 mM reduced glutathione and 3 mM
dichlorophenolindophenol, 5 mM ascorbate, 0.1 mM ferrous sulfate (the ascorbate
and ferrous sulfate were not present in the buffer used in the first experiment;
Table 2), 50 µM unlabeled p-hydroxyphenylpyruvate, 1-25 µL of the enzyme
extract, and 50 mM potassium phosphate buffer in a final volume of 980 µL.
Unlabeled substrate was made fresh daily in 50 mM potassium phosphate buffer
and allowed to equilibrate for at least 2 h at room temperature to ensure that
greater than 95% was in the keto form. The tubes were incubated for 10 min at
30 °C in a shaking water bath prior to adding 20 µL (0.04 µCi) of
14C-p-hydroxyphenylpyruvate. The reaction was terminated after 60 min by
injecting 500 µL of 1 N sulfuric acid through the serum stopper. The vials were
left on the shaker for another 30 min to ensure complete capture of the released
14CO2. The serum caps were then removed and the wells cut and dropped into
8 mL scintillation vials. Six mL of Formula-989 scintillation fluid (Packard
Instruments, Meriden, CT) was added to the vials and the 14C radioactivity was
determined by scintillation counting. Table 2 summarizes the results of this
experiment.
Table 2

p-Hydroxyphenylpyruvate Dioxygenase Activity of Extracts from
E. coli Containing Different Plasmid Constructs

                                     Lysate              Ammonium Sulfate Precipitate
             Inducer
Plasmid      (1 mM IPTG)     dpm*/mg   nmol/min x mg      dpm*/mg   nmol/min x mg

pET24c(+)        -            12,3?8       0.09                 0       0.00
pET24c(+)        +            35,115       0.25             3,393       0.03
pE24CP1          -            24,607       0.17           126,761       0.89
pE24CP1          +           243,801       1.71         1,371,823       9.64

* 14C : 12C = 1 : 50; sp. act. of 14C-p-hydroxyphenylpyruvate = 55 mCi/mmol

The results show there was little or no p-hydroxyphenylpyruvate
dioxygenase activity in any of the cell cultures that did not have the plasmid
containing the nucleic acid fragment encoding p-hydroxyphenylpyruvate
dioxygenase (pET24c(+)) and the inducer of gene expression (IPTG). The gene
and inducer together resulted in a marked increase in activity.
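
The relation between the dpm and nmol/min x mg columns of Table 2 can be illustrated with a short calculation. This is a sketch under stated assumptions, not the applicants' documented procedure: it takes the 1 : 50 footnote to mean one part labeled to fifty parts unlabeled substrate, uses the 60 min incubation described above, assumes 100% counting efficiency and complete 14CO2 capture, and uses the standard conversion of 2.22 x 10^6 dpm per µCi. The function name is hypothetical.

```python
DPM_PER_MICROCURIE = 2.22e6  # 1 µCi = 3.7e4 Bq = 2.22e6 disintegrations per minute


def dpm_to_nmol_per_min(dpm, specific_activity_mci_per_mmol,
                        unlabeled_per_labeled, minutes):
    """Convert captured 14CO2 counts into a substrate turnover rate (nmol/min).

    Assumptions: 100% counting efficiency, and each labeled molecule is
    accompanied by `unlabeled_per_labeled` unlabeled molecules.
    """
    microcuries = dpm / DPM_PER_MICROCURIE
    # mCi/mmol equals µCi/µmol numerically; divide by 1000 to get µCi/nmol.
    microcuries_per_nmol = specific_activity_mci_per_mmol / 1000.0
    labeled_nmol = microcuries / microcuries_per_nmol
    total_nmol = labeled_nmol * (1.0 + unlabeled_per_labeled)
    return total_nmol / minutes


# Induced pE24CP1 lysate from Table 2: 243,801 dpm per mg protein.
rate = dpm_to_nmol_per_min(243801, 55.0, 50.0, 60.0)
print(f"{rate:.2f} nmol/min x mg")
```

With these assumptions the result lands close to the 1.71 nmol/min x mg reported for the induced pE24CP1 lysate; the small difference presumably reflects details of the original calculation that the text does not state.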

In the experiment with [U-14C]-p-hydroxyphenylpyruvate ("HPPA"), where
both 14CO2 and 14C-homogentisic acid were measured, the reaction was initiated
by adding 50 µL of labeled substrate (0.3 µCi) and was terminated with 100 µL of
10% phosphoric acid. The 14CO2 released was determined by scintillation
counting, while the level of homogentisic acid was determined by HPLC on a
Zorbax RX-C8 column (4.6 x 250 mm) with an in-line radioactivity detector.
Aliquots of 1.7 to 15 µL were taken from the reaction mix after centrifugation and
diluted into the column equilibration buffer prior to injection. Separation was
performed at ambient temperature with a flow rate of 1.0 mL/min and the
following gradient, with solvents A and B being water and methanol, each with 1%
phosphoric acid: 0-2 min, isocratic at 95% A and 5% B; 2-17 min, linear gradient
from 95 to 75% A and 5 to 25% B; 17-19 min, linear gradient from 75 to 5% A
and 25 to 95% B; 19-22 min, isocratic at 5% A and 95% B; 22-24 min, linear
gradient from 5 to 95% A and 95 to 5% B. In this system homogentisate eluted
at 10.8 min. The results from this experiment are shown in Table 3.
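
The gradient program above is piecewise linear, so the programmed methanol fraction at any time can be computed by interpolation. The sketch below is illustrative only; the names `GRADIENT` and `percent_b` are hypothetical, and the calculation describes the pump program rather than the composition actually reaching the column, since instrument dwell volume is ignored.

```python
# (time in min, % solvent B) breakpoints taken from the gradient described above.
GRADIENT = [(0.0, 5.0), (2.0, 5.0), (17.0, 25.0),
            (19.0, 95.0), (22.0, 95.0), (24.0, 5.0)]


def percent_b(t_min):
    """Programmed % B (methanol with 1% phosphoric acid) at time t_min."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-24 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min):
    """Programmed % A (water with 1% phosphoric acid); A and B sum to 100%."""
    return 100.0 - percent_b(t_min)


print(f"%B at the homogentisate retention time: {percent_b(10.8):.1f}")
```

Evaluating `percent_b` at 10.8 min shows that homogentisate elutes during the shallow 5 to 25% B segment of the program.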
Table 3

p-Hydroxyphenylpyruvate Dioxygenase Activity of Cell Extracts
Determined by CO2 Release and Homogentisic Acid Synthesis
from [U-14C]-p-Hydroxyphenylpyruvate

                                      nmol/min x mg*
             Inducer
Plasmid      (1 mM IPTG)       CO2        Homogentisic acid

pET24c(+)        -             0.00             0.00
pET24c(+)        +             0.19             0.00
pE24CP1          -             4.68             4.76
pE24CP1          +            29.12            29.82

* 14C : 12C = 1 : 87.7; sp. act. of 14C[U]-p-HPPA = 498 mCi/mmol



There was a tight correlation between the results from the assays of the two
products of the reaction. The results confirmed there was no significant
p-hydroxyphenylpyruvate dioxygenase activity in either cell culture that did not
contain the nucleic acid fragment encoding p-hydroxyphenylpyruvate
dioxygenase. There was measurable enzyme activity in the absence of the
inducer, but when the inducer was added the activity increased greater than
six-fold over uninduced cultures. These results and those of Table 2 clearly show
that the nucleic acid fragment isolated and overexpressed in E. coli cells encodes a
protein that catalyzes the conversion of p-hydroxyphenylpyruvate to
homogentisate with the release of CO2.
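
The "greater than six-fold" increase and the agreement between the two product measurements can be checked directly from the pE24CP1 rows of Table 3. The snippet below is a minimal arithmetic sketch; the variable and function names are illustrative and do not come from the text.

```python
# pE24CP1 rows of Table 3 (nmol/min x mg): (uninduced, induced) for each
# of the two product columns.
PE24CP1_ACTIVITY = [(4.68, 29.12), (4.76, 29.82)]


def fold_induction(uninduced, induced):
    """Ratio of induced to uninduced activity."""
    return induced / uninduced


folds = [fold_induction(u, i) for u, i in PE24CP1_ACTIVITY]
for column, fold in enumerate(folds, start=1):
    print(f"product column {column}: {fold:.2f}-fold induction")

# Agreement between the two product assays for the induced culture.
induced_ratio = PE24CP1_ACTIVITY[0][1] / PE24CP1_ACTIVITY[1][1]
print(f"induced-culture ratio of the two columns: {induced_ratio:.3f}")
```

Both columns give a ratio slightly above six, and the induced-culture values from the two product assays differ by only a few percent.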

The overexpressed protein was also assayed spectrophotometrically at
ambient temperature using the enol borate-tautomerase assay (Lin, E. C. C. et al.,
(1958) J. Biol. Chem. 233:668-673). The assay buffer contained 0.4 M borate
(adjusted to pH 7.2 with 0.2 M sodium borate), 4 mM ascorbate, 2.5 mM EDTA,
40 µM p-hydroxyphenylpyruvate, and 0.5 units of tautomerase (Sigma T-6004)
per 10 mL buffer. The reaction mix was used when the tautomerization of the
substrate was complete (when absorbance at 308 nm had stabilized). The assay
was initiated by adding 40 µL of the cell extracts to 960 µL of the assay buffer,
and the reaction was followed by measuring the decrease in absorbance at 308 nm.
Table 4 summarizes the results with extracts of the same four cell cultures
described in Table 3.
Table 4

Spectrophotometric Assay of p-Hydroxyphenylpyruvate
Dioxygenase Activity of Cell Extracts

             Inducer
Plasmid      (1 mM IPTG)     nmol p-HP lost/min x mg*

pET24c(+)        -                   1.58
pET24c(+)        +                   2.73
pE24CP1          -                   4.91
pE24CP1          +                  22.32

* Loss of p-hydroxyphenylpyruvate based on a molar extinction
coefficient for the equilibrium mixture of 9850 as reported by
Lin et al. ((1958) J. Biol. Chem. 233:668-673).
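
The conversion from an absorbance decrease at 308 nm to the units of Table 4 follows from the Beer-Lambert law. The sketch below is an illustration under stated assumptions: the extinction coefficient of 9850 is taken to be in M^-1 cm^-1, the path length is assumed to be 1 cm, and the assay volume is the 1 mL (40 µL extract plus 960 µL buffer) described above. The function name and the example slope and protein amount are hypothetical.

```python
def rate_nmol_per_min_per_mg(delta_a308_per_min, protein_mg,
                             epsilon=9850.0, path_cm=1.0, volume_ml=1.0):
    """Substrate loss (nmol/min x mg) from the rate of absorbance decrease.

    Assumptions: epsilon is in M^-1 cm^-1 and the cuvette path length is 1 cm.
    """
    # Beer-Lambert: concentration change in mol/L per minute.
    molar_per_min = delta_a308_per_min / (epsilon * path_cm)
    # mol/L * L = mol; multiply by 1e9 to express as nmol.
    nmol_per_min = molar_per_min * (volume_ml / 1000.0) * 1e9
    return nmol_per_min / protein_mg


# Hypothetical reading: absorbance falls by 0.0985 per minute with 1 mg protein.
example = rate_nmol_per_min_per_mg(0.0985, protein_mg=1.0)
print(f"{example:.1f} nmol/min x mg")
```

In a 1 mL assay with a 1 cm path, each 0.00985 absorbance unit lost per minute corresponds to 1 nmol of substrate consumed per minute under these assumptions.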

EXAMPLE 4

Inhibition of p-Hydroxyphenylpyruvate Dioxygenase by Commercial Herbicides
The enzymatic activity of the overexpressed protein is inhibited by two
herbicides known to inhibit plant p-hydroxyphenylpyruvate dioxygenase:
Sulcotrione (2-(2-chloro-4-methanesulfonylbenzoyl)-1,3-cyclohexanedione); and
Isoxaflutole (5-cyclopropylisoxazol-4-yl 2-mesyl-4-trifluoromethylphenyl
ketone). These two compounds were tested against the overexpressed protein
using both the 14CO2 and the continuous spectrophotometric enol borate-
tautomerase assays. Both compounds were added to the assay buffers in 10 µL of
acetone or dimethyl sulfoxide. The I50 values (concentration inhibiting the
enzyme 50%) were calculated based on the percent inhibition observed over
several concentrations of the inhibitor. The results of the assays are shown in
Table 5.

Table 5

I50 Values of Inhibitors of Plant p-Hydroxyphenylpyruvate Dioxygenase

                       I50 value (nM) derived from

Compound          14CO2 assay     spectrophotometric assay

sulcotrione            43                   44
isoxaflutole          409                 1042


These results clearly show that the p-hydroxyphenylpyruvate dioxygenase
activity of the overexpressed protein is inhibited by commercial herbicides that
have inhibition of this enzyme as their mode of action. Moreover, the continuous
spectrophotometric assay gave similar I50 values to those obtained with the 14CO2
assay. The spectrophotometric assay can be adapted to a high capacity screen for
inhibitors of p-hydroxyphenylpyruvate dioxygenase by adapting it to a microtiter
plate assay combined with a plate reader that would read at or near 308 nm.
Furthermore, any colorimetric or fluorescent assay for homogentisate or
p-hydroxyphenylpyruvate would also be able to be readily adapted into a high
capacity screen for inhibitors of this enzyme. The isolated overexpressed enzyme
has sufficient activity to be used directly in a spectrophotometric assay or it can be
further purified for enhanced assay sensitivity.

WO 97/49816 PCT/US97/11295

EXAMPLE 5

Re-construction of the Full-length p-Hydroxyphenylpyruvate Dioxygenase Gene
for Production of Active, Stable Enzyme in Bacteria

The plasmid pT7BlueR+PDO2, described in Example 2 and containing the
full-length p-hydroxyphenylpyruvate dioxygenase gene, proved to have an incorrect
sequence at the EcoRI site. This was re-sequenced so that an oligonucleotide
could be designed to replace the EcoRI site with an NdeI site using conventional
loop-out mutagenesis. The oligonucleotide was designed so that this procedure
also introduced an ATG initiation codon at the 5'-end of the
p-hydroxyphenylpyruvate dioxygenase gene followed by the full-length
p-hydroxyphenylpyruvate dioxygenase sequence. After mutagenesis, the clone was
amplified in E. coli and the plasmid was purified. The resulting full-length gene,
"PDO-B", was then digested with the enzymes NdeI and NheI, and the ~820 bp
fragment was used to replace the NdeI-NheI segment of the truncated
p-hydroxyphenylpyruvate dioxygenase gene, "PDO-A", in pE24CP1 (Example 1).
The resulting plasmid, pE24PDO-B, can be expressed in bacteria to produce the
full-length Arabidopsis p-hydroxyphenylpyruvate dioxygenase enzyme as
determined by enzyme activity and N-terminal sequence analysis.

EXAMPLE 6

Enhanced Stability of Full Length Construct Over the Truncated Construct
Two different constructs for Arabidopsis thaliana p-hydroxyphenylpyruvate
dioxygenase, one containing the full-length sequence, PDO-B as
described in Example 5 and produced from plasmid pE24PDO-B, and one
containing the truncated sequence lacking the putative chloroplast leader
sequence, PDO-A as produced from plasmid pE24CP1, were both purified to the
same extent using a Pharmacia phenyl Sepharose hydrophobic interaction column
followed by gel filtration chromatography on Pharmacia Sephacryl 300. The two
proteins were diluted to 1 mg/mL in 20 mM bis-tris-propane buffer, pH 7.2,
containing 5 mM ascorbate, 1 mM reduced glutathione and 0.1 mM ferrous
ammonium sulfate and stored in a refrigerator at 4 °C for up to 10 days. Aliquots
were removed at various times and assayed for activity using the tautomerase
coupled spectrophotometric assay. Under these conditions the half-life for the
activity of the full length enzyme was 4 days, whereas the truncated enzyme
preparation had a half-life of 9 to 10 hours. In addition, the activity of the full
length enzyme could be restored by incubation with iron and reducing agent,
reduced glutathione or ascorbate, or by dialysis against buffer containing iron and
reducing agent. In contrast, the activity of the truncated enzyme could not be
restored by incubation with or dialysis against buffer containing iron and reducing
agent. The full-length enzyme was also more stable in the spectrophotometric
assay, showing a 2 to 3 times longer useful linear region than the truncated
enzyme. Both enzyme preparations showed similar I50 values with the
herbicidally active inhibitors.
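
The practical consequence of the two half-lives can be illustrated by estimating the activity remaining after a given storage time. The sketch below assumes simple first-order (exponential) loss of activity, which the text does not state; the function name is hypothetical, and 9.5 h is used only as the midpoint of the reported 9 to 10 hour range.

```python
def fraction_remaining(elapsed_hours, half_life_hours):
    """Fraction of initial activity left, assuming first-order decay."""
    return 0.5 ** (elapsed_hours / half_life_hours)


FULL_LENGTH_HALF_LIFE_H = 4 * 24.0  # 4 days, as reported for PDO-B
TRUNCATED_HALF_LIFE_H = 9.5         # midpoint of the 9 to 10 h reported for PDO-A

for hours in (12.0, 24.0, 48.0):
    full = fraction_remaining(hours, FULL_LENGTH_HALF_LIFE_H)
    truncated = fraction_remaining(hours, TRUNCATED_HALF_LIFE_H)
    print(f"{hours:>4.0f} h: full-length {full:.0%}, truncated {truncated:.0%}")
```

Under this assumption the full-length enzyme retains most of its activity after a day at 4 °C, while the truncated enzyme has lost most of its activity over the same period.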

These results clearly show that the full-length PDO-B construct has
decided advantages over the truncated enzyme due to the enhanced stability under
storage conditions, in the spectrophotometric assay and in the reversible
reconstitution of activity in the presence of iron and reducing agent. While both
enzyme constructs can be used for screening of inhibitors, the PDO-B enzyme is
preferred for this application and is far superior for mechanistic and structural
studies.

EXAMPLE 7

Cloning of the Maize p-Hydroxyphenylpyruvate Dioxygenase Gene

Approximately 600,000 plaques of a Stratagene maize Uni-Zap cDNA
library (from young plants) were screened by filter hybridization under moderate
stringency using a heterologous probe. The probe was prepared by PCR and was
a 916 bp fragment of DNA having the sequence defined by the region extending
from position 263 to 1178 of SEQ ID NO:14. Twenty-four positive phage clones
were identified in the primary screen, and eleven phage clones were recovered
from a secondary screen. Seven positive clones were submitted for sequencing,
and four showed significant sequence conservation at the amino acid level when
compared with the Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
protein. The longest of the four contained an insert of 988 bp and showed 70%
identity and 78% similarity with the Arabidopsis protein, but was lacking
approximately 550 bp corresponding to the amino terminal end of the protein.

Attempts to obtain a full-length cDNA of the maize p-hydroxyphenylpyruvate
dioxygenase gene were unsuccessful, possibly because the secondary
structure of the RNA inhibited efficient reverse transcription of this transcript.
Two additional cDNA libraries were screened and clones long enough to contain a
full-length cDNA were sequenced. All of these clones were shown to be
chimeras. Therefore a genomic library was screened to obtain the 5' one-third of
the gene. Approximately 1 million clones from a Clontech Zea mays (var. B73)
library in the phage vector EMBL3 (whole seedlings, 2 leaf stage) were screened
using a 415 bp EcoRI-BssHII fragment containing the 5' end of the truncated corn
p-hydroxyphenylpyruvate dioxygenase cDNA (clone H1011C). Eight positive
primary phage clones were plated and screened, and four secondary clones were
picked. DNA was prepared from each using the Qiagen Lambda midi-kit.
Restriction digests with SalI or EcoRI indicated that two clones were the same.
DNA samples from the remaining 3 clones (11.1.3, 13.1.1, and 21.2.1) were
digested with SalI, EcoRI, or SalI and EcoRI, prepared for Southern analysis, and
probed with the full length Arabidopsis p-hydroxyphenylpyruvate dioxygenase
gene. Two of the clones (11.1.3 and 13.1.1) showed sequence conservation, and
these homologous fragments were subcloned and sequenced. Both clones
appeared to contain the full-length gene and each contained one intron near the 3'
end of the gene. However, there were differences between the sequences of the
two clones, indicating that they may be two different genes or one may be a
pseudogene. The sequence of clone 11.1.3 matched the cDNA sequence, and this
clone was used to construct a full length p-hydroxyphenylpyruvate dioxygenase
coding region.

The gene was contained on two adjacent fragments, a 3.5 kb EcoRI-SalI
fragment and a 2 kb SalI fragment. Both were subcloned into pBluescript SKII+,
resulting in the plasmids pES1113 and pSal1113. pES1113 was digested with
SpeI to release approximately 2.7 kb of upstream sequence and then religated,
resulting in a plasmid with an insert of 747 base pairs (pSPE1). pSPE1 was
digested with SalI to linearize the plasmid and ligated with the 2 kb SalI fragment
from pSal1113, which had been released by digestion with SalI and gel purified.
Orientation was confirmed by digestion with SpeI and Bpu1102I and the correct
plasmid was named p1113. In order to remove the intron contained in the 3' end
of the genomic clone, the plasmid was digested with Bpu1102I and XhoI and the
3.9 kb fragment containing the vector and 5' part of the gene was gel purified.
The corresponding 882 bp Bpu1102I-XhoI fragment from pH1011c (cDNA) was
gel purified and ligated with this 3.9 kb fragment, resulting in the clone pMPDO
(ATCC 209120), which contains a 1782 bp insert. There are 260 base pairs
upstream of the putative ATG and 189 base pairs downstream of the stop codon.
The full-length sequence was confirmed by sequencing across the insert. The
nucleic acid sequence and the deduced protein sequence for corn
p-hydroxyphenylpyruvate dioxygenase are presented in SEQ ID NOS:10 and 11,
respectively. The sequences for p-hydroxyphenylpyruvate dioxygenases obtained
from corn and Arabidopsis were compared using the "Gap" program of GCG
(Program Manual for the Wisconsin Package, Version 9.0-OpenVMS, December
1996, Genetics Computer Group, 575 Science Drive, Madison, WI, USA 53711).
The results of these comparisons indicated that these functions are approximately
67% identical at the nucleotide level, and they possess 69% similarity and 62%
identity at the amino acid level. The predicted amino acid sequence of corn
p-hydroxyphenylpyruvate dioxygenase is compared with that from Arabidopsis
and other eukaryotes in Figure 3.

EXAMPLE 8

Composition of a cDNA Library; Isolation and Sequencing of cDNA Clones
A cDNA library representing mRNAs from developing seeds of Vernonia
galamenensis that had just begun production of vernolic acid was prepared. The
library was prepared in a Uni-ZAP™ XR vector according to the manufacturer's
protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the
Uni-ZAP™ XR library into a plasmid library was accomplished according to the
protocol provided by Stratagene. Upon conversion, cDNA inserts were contained
in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial
colonies containing recombinant pBluescript plasmids were amplified via
polymerase chain reaction using primers specific for vector sequences flanking
the inserted cDNA sequences. Amplified insert DNAs were sequenced in
dye-primer sequencing reactions to generate partial cDNA sequences (expressed
sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651). The
resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent
sequencer.

EXAMPLE 9

Identification and Characterization of cDNA Clones

ESTs encoding Vernonia galamenensis enzymes were identified by
conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al.,
(1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/)
searches for similarity to sequences contained in the BLAST "nr" database
(comprising all non-redundant GenBank CDS translations, sequences derived
from the 3-dimensional structure Brookhaven Protein Data Bank, the last major
release of the SWISS-PROT protein sequence database, EMBL, and DDBJ
databases). The cDNA sequences obtained in Example 8 were analyzed for
similarity to all publicly available DNA sequences contained in the "nr" database
using the BLASTN algorithm provided by the National Center for Biotechnology
Information (NCBI). The DNA sequences were translated in all reading frames
and compared for similarity to all publicly available protein sequences contained
in the "nr" database using the BLASTX algorithm (Gish, W. and States, D. J.
(1993) Nature Genetics 3:266-272) provided by the NCBI. For convenience, the
P-value (probability) of observing a match of a cDNA sequence to a sequence
contained in the searched databases merely by chance as calculated by BLAST is
reported herein as a "pLog" value, which represents the negative of the logarithm
of the reported P-value. Accordingly, the greater the pLog value, the greater the
likelihood that the cDNA sequence and the BLAST "hit" represent homologous
proteins.
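
The relation between a BLAST P-value and the pLog value defined above can be written out explicitly. This sketch assumes the logarithm is base 10, which the text does not specify; the function names are illustrative.

```python
import math


def plog(p_value):
    """pLog = negative logarithm of the BLAST P-value (base 10 assumed)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be in the interval (0, 1]")
    return -math.log10(p_value)


def p_value_from_plog(plog_value):
    """Inverse transformation: recover the P-value from a pLog value."""
    return 10.0 ** (-plog_value)


# A pLog of 8.34 corresponds to a chance-match probability of a few parts per billion.
print(f"P = {p_value_from_plog(8.34):.2e}")
```

Because the transformation is monotonic, ranking hits by descending pLog is the same as ranking them by ascending P-value.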

The BLASTX search using clone vs1.pk0015.b2 revealed similarity of the
protein encoded by the cDNA to a number of p-hydroxyphenylpyruvate
dioxygenases from sources other than plants. The three most similar
p-hydroxyphenylpyruvate dioxygenase proteins were a streptomycete
p-hydroxyphenylpyruvate dioxygenase (GenBank Accession No. U11864;
pLog = 8.34), a rat p-hydroxyphenylpyruvate dioxygenase (GenBank Accession
No. M18405; pLog = 7.66), and a human p-hydroxyphenylpyruvate dioxygenase
(GenBank Accession No. U29895; pLog = 7.60). SEQ ID NO:16 shows the
nucleotide sequence of a portion of the Vernonia galamenensis cDNA in clone
vs1.pk0015.b2. Sequence alignments and BLAST scores and probabilities
indicate that the instant nucleic acid fragment encodes a portion of Vernonia
galamenensis p-hydroxyphenylpyruvate dioxygenase.
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SEQUENCE LISTING 
(i) GENEPJ\L IMEORMATIOM: 

(i) APPLICANT: 

(A) NAME: E. I, DUPONT DE NEMOURS AND COMPANY 

(B) STREET: 1007 MARKET STREET 

(C) CITY: WILMINGTON 

(D) STATE: DELAWARE 

(E) COUNTRY: U.S.A. 

{E) POSTAL CODE iZlV): 19696 

(G) TELEPHONE: 1- O;: - o 92 -3 11 2 

(H) TELEEA:-:: J02-771-016^ 

(I) TELEX: 6717325 

(ii) TITLE OF INVENTION: FLANT GENE FOR p-HiDROXY- 

PHENYL? Y R U V ATE D I O X Y G E N A 5 E 

{ill J NUMBER OF SEQUENCE:^: 16 

{ i V ) COM P UT E R R E A D A B L E FORM : 

{A} MEDIUM TYPE: ::iSKETTE, 3.50 INCI: 
{E} COMPUTEt^. : IBM PC COMPATIBLE 

.;Ci OPERATING SYSTEM: MICROSOFT WORD FOR WINDOV;S 
iD) SOFTVJARE: MICROSOFT WORD VERSION 7 . OA 

(V) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
.:B) FILING DATE: 
:C) CLASSIFICATION: 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: nO/021,36^ 

(B) FILING DATE: JUNE 2"/, 1996 

(vi I) ATTORNEY/AGENT INFORMATION: 

(A) NAME: FLOYD, LINDA AX/VMETHY 
iS] REGISTRATION NUMBER: 33,692 

(C) REFERENCE/DOCKET NUMBER: BA-9120 
(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 233 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CAAGAAACGN GTCGNCGACG TGCTCAGCGA TGATCAGATC AAGGAGTGTG AGGAATTAGG   60

GATTCTTNTA GACAGAGATG ATCAAGGGAC GTTNCTTCAA ATCTNCACAA AACCACTAGG  120

TGACAGGCCG ACGNTATTTA TAGAGATAAT CCAGAGNGTA GGATGCATGA TGAAAGATGT  180

GGAAGGGANG GCTTACCAGA GTGGAGNATN TNGTCGTTTT GGCAAAGGCA ATT         233

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i; SEQUENCE CHARACTERISTICS: 

'A) LENGTH: i4hS base pairs 
;a} TYPE: nucleic acid^ 
■:C; STRANDEDNESS : smqie 
:D} TOPOLOGY: linear " 

(ii; MOLECULE TYPE: cDNA 

(!>:} FEATURE: 

;A) NAME/KEY: CDS 

(B) LOCATION: 9. .13^3 

(xi) SEQUENCE DESCRIPTION: SEQ iO NO : 2 : 

TGAAATCA ATG GGC CAC CAA AAC GCC GCC GTT TCA GAG PJKT CAA AAC CAT 5 0 

Met Gly His Gin Asn Ala Ala Val Scr Glu Asn Gin Asn His 
1 5 10 

GAT GAC GGC GCT GCG TCG TCG CCG GGA TTC AA.G CTC GTC GGA TTT TCC 9G 
Asp Asp Gly Ala Ala Sor Ser Pro Civ Phe Lvs Leu Val Glv Phe S^r 

20 ' 2 5 ' 30 

AAG TTC GTA AGA A7.G AAT CCA AAG TCT GAT A/-.A TTC AAG CTT AJ^G CGC 1^6 
Lys Phe Val Arq Lyti Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg 

'iC ' ^,5 

TTC CAT CAC ATC GAG TTC TCG TGC GGG GAC CCA ACC AAC GTC GCT CGT 194 
Phe His His lie Glu Phe Trp Cys Gly Asp Ala Thr Asn Val Ala Arc 
50 55 60 

CGC TTC TCC TGG GGT CTG GGG ATG AGA TTC TCC GCC AAA TCC GAT CTT 24 2 

Arg Phe Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser Asp Leu 
65 70 75 . 

TCC ACC GGA AAC ATG GTT CAC GCC TCT TAC CTA CTC ACC TCC GGT GAA 290
Ser Thr Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Glu
80 85 90

CTC CGA TTC CTT TTC ACT GCT CCT TAC TCT CCG TCT CTC TCC GGC GGA 338
Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Gly Gly
95 100 105 110

GAG ATT AAA CCG ACA ACC ACA GGT TCT ATC CCA AGT TTC GAT CAC GGG 386
Glu Ile Lys Pro Thr Thr Thr Gly Ser Ile Pro Ser Phe Asp His Gly
115 120 125

TCT TGT CGG 'JTT' T'J''^ 'Y'^C TCT TCA CAT GGT CTC GGT GTT AGA CCG GTT 434
Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Pro Val
130 135 140

GCG ATT GAA GTA GAA GAC GCG GAG TCA GCT TTC TCG ATC AGT GTA GCT 482
Ala Ile Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala
145 150 155

AAT GGC GCT ATT CCT TCG TCG CCT CCT ATC GTC CTC AAT GAA GCA GTT 530
Asn Gly Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val
160 165 170

ACG ATC GCT GAG GTT AAA CTA TAC GGC GAT GTT GTT CTC CGA TAT GTT 578
Thr Ile Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg Tyr Val
175 180 185 190

AGT TAC AAA GCA GAA GAT ACC GAA AAA TCG GAA TTC TTG CCA GGG TTC 626
Ser Tyr Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro Gly Phe
195 200 205

IjAG Co. ^tI a GAG GAT GCG TCG TCG TTC CCA TTG GAT TAT GGT ATC CGG 674
Glu Arg Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly Ile Arg
210 215 220

CGG CTT GAC CAC GCC GTG GGA AAC GTT CCT GAG CTT NNN NNN NNN TTA 722
Arg Leu Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro Ala Leu
225 230 235

ACT TAT GTA GCG GGG TTC ACT GGT TTT CAC CAA TTC GCA GAG TTC ACA 770
Thr Tyr Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu Phe Thr
240 245 250

GCA GAC GAC GTT GGA ACC GCC GAG AGC GGT TTA AAT TCA GCG GTC CTG 818
Ala Asp Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala Val Leu
255 260 265 270

GCT AGC AAT GAT GAA ATG GTT CTT CTA CCG ATT AAC GAG CCA GTG CAC 866
Ala Ser Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro Val His
275 280 285

GGA ACA AAG AGG AAG AGT CAG ATT CAG ACG TAT TTG GAA CAT AAC GAA 914
Gly Thr Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His Asn Glu
290 295 300

GGC GCA GGG CTA CAA CAT CTG GCT CTG ATG AGT GAA GAC ATA TTC AGG 962
Gly Ala Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile Phe Arg
305 310 315

ACC CTG AGA GAG ATG AGG AAG AGG AGC AGT ATT GGA GGA TTC GAC TTC 1010
Thr Leu Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe Asp Phe
320 325 330

ATG CCT TCT CCT CCG CCT ACT TAC TAC CAG AAT CTC AAG AAA CGG GTC 1058
Met Pro Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys Arg Val
335 340 345 350

GGC GAC GTG CTC AGC GAT GAT CAG ATC AAG GAG TGT GAG GAA TTA GGG 1106
Gly Asp Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu Leu Gly
355 360 365

ATT CTT GTA GAC AGA GAT GAT CAA Gr;o ACG TTG CTT CAA ATC TTC ACA 1154

Ile Leu Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile Phe Thr

370 375 380 



543 



AA.A CCA CTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA A-^'C CAG AG^ -'^O-^ 
Lys Pro Leu Gly Asp Arg Pro Thr He Phe- lie Glu He He Gir A-o 
385 390 395 

GTA GGA TGC ATG ATG AAA GAT GAG GAA GGG AA.G GOT TAC CAG AG" ^'>^Q 
Val Gly Cys Met Met Lys Asp Giu Glu Gly Lys Ala Tyr G ^ Se- --iv 
400 405 410 * ^ 

GGA TGT GGT GGT TTT GCC AAA GGC A.AT TTC TCT GAG CTC TTr a:^" '^C 12 98 
Gly Cys Gly Gly Phe Ala Lys Gly Asn Phe 3er Glu Leu Ph-- l"*- ^er 

420 425 " ;30 

ATT GA.A GAA TAC GAA AAG ACT CTT GA.A GCC AAA CAG TTA GT-S G-^^:- 
He Glu Glu Tyr Giu Lys Thr Leu Glu Ala Lys Gin Leu Val Glv 
435 440 445 

TGAACAAGAA GAAGAACCAJV CTAAAGGATT GTGTAATTP.A. TGTAAAACTG TTTTATCTTA 14 03 

TCAA>iACA-AT GTATACAACA TCTCATTTAA AAACGAGATC AATCC 14 4 b 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 445 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Gly His Gln Asn Ala Ala Val Ser Glu Asn Gln Asn His Asp Asp
1 5 10 15

Gly Ala Ala Ser Ser Pro Gly Phe Lys Leu Val Gly Phe Ser Lys Phe
20 25 30

Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg Phe His
35 40 45

His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val Ala Arg Arg Phe
50 55 60

Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser Asp Leu Ser Thr
65 70 75 80

Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Glu Leu Arg
85 90 95

Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Gly Gly Glu Ile
100 105 110

Lys Pro Thr Thr Thr Gly Ser Ile Pro Ser Phe Asp His Gly Ser Cys
115 120 125

Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Pro Val Ala Ile
130 135 140

Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala Asn Gly
145 150 155 160

Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val Thr Ile
165 170 175

Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg Tyr Val Ser Tyr
180 185 190

Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro Gly Phe Glu Arg
195 200 205

Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly Ile Arg Arg Leu
210 215 220

Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro Ala Leu Thr Tyr
225 230 235 240

Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu Phe Thr Ala Asp
245 250 255

Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala Val Leu Ala Ser
260 265 270

Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro Val His Gly Thr
275 280 285

Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His Asn Glu Gly Ala
290 295 300

Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile Phe Arg Thr Leu
305 310 315 320

Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe Asp Phe Met Pro
325 330 335

Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys Arg Val Gly Asp
340 345 350

Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu Leu Gly Ile Leu
355 360 365

Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile Phe Thr Lys Pro
370 375 380

Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile Gln Arg Val Gly
385 390 395 400

Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln Ser Gly Gly Cys
405 410 415

Gly Gly Phe Ala Lys Gly Asn Phe Ser Glu Leu Phe Lys Ser Ile Glu
420 425 430

Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val Gly
435 440 445

(2) INFORMATION FOR SEQ ID NO: 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
TATGTCCAAG TTCGTAAGAA AGAATCCAAA GTCTGATAAA TTCAAGGTTA AGC 53

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GCTTAACCTT GAATTTATCA GACTTTGGAT TCTTTCTTAC GAACTTGGAC A 51

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Thr Ser Tyr Ser Asp Lys Giy Glu Lys Pro Glu Arg Giy Ara Phe L^u 
^ 10 



51 



His Phe Hxs Ser Val Thr Phe Trp Va 1 Cly Asn Ala Lys Gin Ala Al. 
Ser Tyr Tyr Cys Ser Ly.s lie Gly Phe Glu Pro Leu Ala Tyr Lys Gly 



Leu Glu Thr Gly Ser Arg Glu Vai Vai Scr H.s Val Val Lys Gin Asp 

Lys lie val Phe V.l Phe Scr Ser AI. Leu Asn Pro Trp Asn Lys Giu 

Met Gly Asp His Leu Val Lys H.s Gly Asp Gly Val Lys Asp 1>. Axa 

90 ■ 95" 

Phe Glu val Glu Asp Cys Aso Tyr lie Val Gi:. Lys Ala Arc Glu Arg 

Gly Ala lie He Val Arg Glu Glu Val Cys Cys Ala Ala Aso Val Arg 

1 20 125 

Gly His His Thr Pro Leu Asp Arg Ala Arg Gin Val Tro Glu Glv Thr 

135 

Leu val Glu Lys Met Thr Phe Cys Leu Asp Ser Arg Pre Gin Pro Ser 

155 

Gin Thr Leu Leu His Arg Leu Leu Leu "Ser Lys Leu Pro Lvs Cys Gly 

170 ■ 175 

Leu Glu Ile Ile Asp His Ile Val Gly Asn Gln Pro Asp Gln Glu Met

180 185 190
Glu Ser Ala Ser Gln Trp Tyr Met Arg Asn Leu Gln Phe His Arg Phe
195 200 205 

Trp Ser Vai Asp Asp Thr Gin lie His Thr Giu Tyr 3or Ala Leu Arg 
2i0 2i5 220 

Ser Vai Vai Met Aia Asn Tyr Glu Giu Ser lie Lys Kez Pro lie Asn 
225 230 " 235 240 

Giu Pro Aia Pro Giy Lys Lys Lvs Ser Gin lie Gin Glu Tyr Vai Asp 
2^5 250 255 

Tyr Asn Giy Giy Aia Giy Vai Gin His lie Aia Leu Lys Thr Giu Asp 
260 ' 265 270 

lie lie Thr Aia lie Arg Ser Leu Arg Giu Arg Giy Vai Glu Phe Leu 
275 280 2£5 

Aia Vai Pro Phe Thr Tyr Tyr Lys Gin Leu Gin Glu Lys Leu Lys Ser 
290 295 300 

Aia Lys lie Arg V^i Lys Giu Ser lie Asp Vn 1 L-.^u Gl.: Glu Leu Lys 
305 310 315 320 

He Leu Vai Asd Tvr Aso Glu Lvs Giy Tyr Leu Leu Glr. lie Phe Thr 
325 ' " 330 335 

Lvs Pro Met Gin Aso Arc Pro Thr Vai Phe Leu Glu Vai lie Gin Arg 
340 ' " 345 350 

Asn Asn His Gin Giy Phe Giv Aia Giy Asn Phe Asn Ser Lou Phe Lvs 
355 ' 360 365 

Aia Phe Giu Glu Glu Gin Giu Leu Arg Giy Asn Leu Thr Asp Thr Asp 
370 375 3SG 

Pro Asn Giy Vai Pro Phe Arg Leu 
385 390 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

Thr Ser Tvr Ser Asp Lys Giy Glu Lys Pro Glu Arg Giy Arg Phe Leu 
1 ' 5 10 15 

His Phe His Ser Vai Thr Phe Trp Vai Giy Asn Ala Lys Gin Aia Aia 
20 25 30 

Ser Tyr Tyr Cys Ser Lys lie Giy Phe Giu Pro Leu Ala Tyr Lys Giy 
35 40 45 

Leu Glu Thr Giy Ser Arg Giu Vai Vai Ser His Vai Vai Lys Gin Asp 
50 55 . 60 

Lys Ile Val Phe Val Phe Ser Ser Ala Leu Asn Pro Trp Asn Lys Glu
65 70 75 80
Met Gly Asp His Leu Val Lys His Gly Asp Gly Val Lys Asp Ile Ala
85 90 95

Phe Glu Vai Giu Aso Cys Asc Tvr lie Val Gin Lvs Aia Arq Giu Arq 
100 105 lie 

Gly Ala lie lie Vol Arc Glu Giu Vai Cys C\-e Ala Ala Aso Va " A^q 
11- 120 ' 125 * 

Giy His His Thr Pro Leu Asp Arq Ala Arg Gin Val Tro Giu Gl\- Thr 
130 135 140 

Leu Vai Giu Lys Met. Th.r Phe Cy:;- Leu Asp Ser Aro Pro Gin tro Ser 
l^J^ 150 155 ^ 160 

Gin Thr Leu Leu His ArL7 Leu Leu Leu Ser Lys Leu Pro Lvs Cvs Giv 
lo5 170 ' ' 17 5 

Leu Glu lie lie Asp His lie Val Glv Asn Gl-^^ Pro A-^p Gin GJu 
18C • IS 5 ' 190 

Giu Ser Ala Ser Gin Trp Tyr yiet Aru Asn Leu Gin Phf- -ii^^ Ar-j F'-^e 
1^5 :-C0 ' 205 

Trp Scr Vai Asp Asp Thr G 1 r; .: - Hi=; T-r Giu 'J'vr S^i Ala L-u Ara 
210 Li 5 220 

Ser Val Val Met Ala A::n Tyr Glu Glu bor lie Lys Met Pro 11.1; Asn 

2 30 2 35 24 0 

Glu Pro Ala Pro Gly Lys Lys Lys Ser Gin lie Gin Giu Tvr Vai Aso 
2 4 5 2 50 2 5 5' 

Tyr Asn Gly Gly Ala Giy Vai Gin His lie Ala Leu Lys Thr Glu Asp 
260 265 ' 270 

lie lie Thr Ala Il.e Arq Ser Leu Aro Giu Arq Giy Val Giu Phe Leu 
275 280 ■ 285 

Ala Vai Pro Phe Thr Tyr Tyr Lvs Gin Leu Gin Giu Lys Lou Lys Ser 
290 295 300 

Ala Lys lie Arq Val Lys Giu :;er lie Asn V.,! Lou Glu Glu Leu Lvs 
305 310 315 320 

lie Leu Val Asp Tyr Asp Giu Lvs Glv Tvr Leu Leu Gin lie Phe Thr 
325 330 335 

Lys Pro Met Gin Asp Arq Pro Thr Vai Phe Leu Giu Vai lie Gin Arg 
340 345 350 

Asn Asn His Gin Giy Phe Giy Ala Giv Asn P\\e Asn Ser Leu Phe Lys 
355 360 ' 365 

Ala Phe Glu Glu Glu Gin Glu Leu Arq Glv Asn Leu Th.r Asp Th^ Asp 
370 375 ' 380 

Pro Asn Gly Vai Pro Phe Arg Leu 
385 390 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Thr Thr Tyr Asn Asn Lvs Glv L^ro Lv's l=ro Glu Arq Glv Arq Phe Leu 
1 5 ' ' 10 15 

His Phe His Ser Val Thr Phe Trp Val Gly Asn Ala Lys Gin Ala Ala 
20 25 30 

Ser f/he Tyr Cvs Asn Lv:5 Meu Glv Phe Glu L-rc Leu Ala Tyr Arq C-y 
3b ' ' 4 0 

Leu Glu Thr Gly Ser Arq Glu V<il val S^-r ii^s Val lie Lys Arq Gly 
SO 5 5 GO 

Lys lie Val Phe Val Leu Cys Ser Ala Leu Asn Pro Trp Asn Lys Glu 
65 70 75 80 

Met Gly Asp His Leu Val Lys liis Gly Asp G i. y Val Lys Asp lie Ale 
8 5 " ' j 0 9 5 

Phe Glu Val Glu Asp C'/s As:::- His lie Va !. Gin Lys Ala Ara Glu Ar-:: 
100 ' ' " 105 110 

Gly Ala Lvs lie Val Arq Glu Pro Trp Val Glu Gin Asp I.ys Phe Gly 
115 12C 125 

Lys Val Lys Phe Ala Val Leu Gin Thr Tyr Gly Asp Tnr Thr His Thr 
130 135 140 

Leu Val Glu Lys He Asn Tvr Thr Gly Arq Phe Leu Fro Gly Phe Glu 
145 150 155 160 

Ala Pro Thr Tvr Lvs Asp Thr Leu Leu Pro Lvs Leu Pro Ara Cys Asn 
165 170 175 

Leu Glu lie He Aso His lie Val Glv Asn Gin Pro Asp Gin Glu Met 

180 * 1-^^^: 190 

Gin Ser Ala Ser Glu Trc Tvr Leu Lvs Asn i-eu Gin Phe Wis Arc Phe 
195 ' ' 200 * 205 

Trp Ger Val Asp Asp Thr Gin Val His Thr Glu Tyr Ser Ser Leu Ara 
210 ' 215 220 

Ser He Val Val Thr Asn Tyr Glu Glu Ser ile Lys Met Pro He Asn 
22 5 2 30 2 35 2 40 

Glu Pro Ala Pro Gly Arq Lys Lys Ser Gin Ho Gin Glu Tyr Val Asp 
245 250 255 

Tyr Asn Gly Glv Ala Gly Val Gin His lie Ala Leu Lys Thr Glu Asp 
260 265 270 

He lie Thr Ala lie Ara His Leu Arg Glu Arq Gly Thr Glu Phe Leu 
275 " 280 235 

Ala Ala Pro Ser Ser Tyr Tyr Lys Leu Leu Arq Glu Asn Leu Lys Ser 
290 295 300 

Ala Lys Ile Gln Val Lys Glu Ser Met Asp Val Leu Glu Glu Leu His
305 310 315 320
Ile Leu Val Asp Tyr Asp Glu Lys Gly Tyr Leu Leu Gln Ile Phe Thr
325 330 335

Lys Pro Met Gin Asp Arg Pre Thr Leu Fhe Leu Glu Vai lie GJn A-q 
340 345 

His Asn His Gin Gly The Giy Ala Gly Asn Phe Asn Ser Leu Pho Lv--^ 
355 360 365 

Ala Phe Giu Glu Glu Gin Aia Leu Arg Giv Asn Leu Thr Asp Leu G^u 
370 375 ' 380 

Pro Asn Giy Vai Arq Ser Giv Mcrj 
385. 390 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Tyr Trp Asp Lys Gly Pro Lys Pro Glu Arcr Gly Arc Phe Leu Uxs Phe 
•1 5 10 ' ^5 

His Ser Vai Thr Phe Trp Vai Giy Asn Ala Lys Gin Ala Ala Ser Phe 
20 25 30 

Tyr Cys Asn Lys Met Giy Phe Glu Pro Leu Ala Tyr Lvs Giv Leu G^u 
35 AO 45 " 

Thr Giy Ser Arq Giu Vai Vr.l Ger His Vai Ii- Lys Gin Gly Lvs lie 
50 55 60 

Vai Phe Vai Leu Cys Ser Ala Leu Asn Pro Tro Asn Lvs Giu Met Glv 
^5 70 75' ' so' 

Asp His Leu Vai ^ys His GXy Asp Giy Vai Lvs Asp Ilo Ala Phe Glu 
95 90 95 

Vai Giu Asp Cys Giu His lie Vai Gin Lys Ala Ara Glu Arg Gly Ala 
100 105 110 

Lys lie Vai Arc Giu Pro Tro Vai Glu Glu Asp Lys Phe Glv Lys Va > 
115 120 125 

Lys Phe Ala Vai Leu Gin Thr Tyr Giy Asp Thr Thr His Thr Leu Vai 
130 135 

Giu Lys lie Asn Tyr Thr Giy Arg Phe Leu Pro Giy Phe Glu Ala ^^r-c 

150 155 160 

Thr Tyr Lys Asp Thr Leu Leu Pro Lys Leu Pro Ser Cys Asn Leu Glu 
165 170 175 

He lie Asp His He Vai Giy Asn Gin Pro Asp Gin Glu Met Giu S^r 
180 185 

Ala Ser Glu Trp Tyr Leu Lys Asn Leu Gln Phe His Arg Phe Trp Ser
195 200 205
Val Asp Asp Thr Gln Val His Thr Glu Tyr Ser Ser Leu Arg Ser Ile

210 215 220

Vai Vai Ala Asn Tyr Giu Giu Ser lie Lys Met Pro lie Asn Giu Pro 

225 23C 235 240 

^la Pro Glv Arg Lys Lvs Ser Gin lie Gin Giu Tyr Vai Asp Tyr Asn 

245 ^ 250 255 

G^v Glv Ala Glv Vai Gin His lie Ala Leu Arg Thr Giu Asp lie lie 

260 265 270 

--r Thr I^e Arc His Leu Arg Giu Arg Cly Met Giu Phe Leu Ala Vai 

275 " 280 285 

Pro Ser Ser Tyr Tyr Arq Leu Leu Arg Giu Asn Lou Lys Thr Ser Lys 

290 295 



:ie Gin Vai Lys Giu Asn Met Asp Vai Leu Giu Giu Leu Lys 11 



e Leu 

10 ^ 315 320 



Va^ Asp Tvr Aso Giu Lvs Glv Tyr Leu Leu Gin lie Phe Thr Lys Pro 
32 5 ' 3 30 3 35 

G'ii A=o A-q Pro Thr Leu Phe Leu Gl\:; Vai lie Gl'i Arg Asn 
340 345 350 

■-■^s Gin G2V Phe Giy Ala Gly Asn Phe Asn Ser Leu Phe Lys Ala Phe 
355 360 365 

G^u Giu Giu Gin Ala Leu Arg Gly 
370 375 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Zea mays

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 261..1595

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ACTAGTTGTG AGAGCCTTCT GCGTTGGCAA TTGGCAGTAC AAGACAAATC ACATCCGCAA 60 

CCGC.-.ACCAC AGA.ATCGTCC GTCCACGTGG CCCCCATCAC TTCCCTTTAT TTACCAGTCG 120 

TCCCCCATCC CCAGGGCCAC CCACCAACAA GTGCAGTCAC CCGAGCCGCA AACTGCAGCT 180 

CTGCAAGCTA CAGAGGCCAC CACGAGTCCA CGACGCCACG CCCTCCGAGA GAAAGAGAAA 240
GAG^CCA .^AGCACGATA ATG CCC CCG ACC CCC ACA GCC GCC GCA G^C

Met Pro Pro Thr Pro Thr Aia Al.i Ala Ala 

s?5 tii m m J- s« s= 

20 -r, *^ ' 

£ 

CTC GTG GGC CAC CGC T^^^C TTC GTC r^C TT^ A^^r --r rrr -rr r-..^ 
Leu Vai Gl, H.s Arg Asn P.e Val Ar, ^In Pro Ar^ ^ Sp Ar^ 

TTC CAC ACG CTC GCG TTC CAC CAC GTG GAG CT-- — r r-r- r-r-r- -n.^ 
P.e H.s Tnr Leu Ala PHe Hi. H.s V^i' clu ^Jc-u ^rp cfs Sa As^ S 

G'_C TCC GCC GCG GGC CGC TTC TC'" TTr nc^C CTr rr^ r--.- 

Ala s„ o,, P„ ^l- CCC 

65 7Q 

GCA CGC TCC GAC CTC TCC ACG GGC A.AC TCC GCG CAr- GCG Trr r-rr 
A.o Hrg Ser Asp Lou Ser Thr Gly Asn Ser Aiu Hi^ A^^ S^r I^u L^u 

^ 9 0 

CTC CGC TCC GGC TCC CTC TCC TTC rTC TT^ AC'" rr- ■ --^r^ 

L.u Arg Ser Gly Ser Leu .or Ph. Leu Phe Th^ Ala P^c T^r aI^ 
■ 100 20S 

GGC GCC GAC GCT GCC ACC GCC GCG CTG CCC T^C T"^^ TCC G— — -r- 
Gly Ala ASP Ala Ala Thr Ala Ala Leu Pro Ser pAc Ser Ala^ A^a^ f,l 

GCG CGG CGC TTC GCA GCC GAC CAC GGC C-r GCG G-r; r-r rc- -T- — 
Ala Arg Arg Phe Ala Ala Asp His Gly L.. £ Vai^ ^ A^a vll Xla 



)30 



67 4 



130 

CTC CGC GTC GCC GAC GCC GAG GAC GCC TTC CGC GC^" AG- GTC r.-- rrr 

Leu Arg Vai Ala Asp Ala Giu Asp Ala Phe Arc Xla Se;: vIJ aI^ A^ 

1 5 0 



8 18 



GG.- CGC CCG GCG TTC GGC CCC GTC GAC CTC GGC CGC GGC TT." CG- 

o.. ...a Arg ,>ro A. a Phc Gly Pro Vai A.p Le. Glv Acu Glv J:': Arc ' 

1Gb - ■ - ■- .^^ 

aS g?u Sir rf P'^ V'"" ''^^ ^-^'^^ '^'^ CGG TAC GTG AGC 

Le. Axa Giu Vai Glu Leu Tyr Gly Asp Vai Vai Leu Arg Tyr Vol Ser 

^ oro ASP G?S A?^ A?"" r?^ ^^^^^ -^^^ G=C 

•V- .ro Asp Gly Ala Ala Gly Glu Pro Phe Leu Pro Gly Phe Glu Gly 

155 200 

vl^' A?f o^^ '^^^ ^"^^ '^AC GGG CTG AGC AGG TTC GAC CAC 

Va. Ala ser Pro Gly Ala Ala Asp Tyr Gly Leu Ser Arg Asp 

210 215 

ATC GTC GGC AAC GTG CCG GAG CTG GCG CCC GCC GCC GCC TAG TTr- rrr 

lie Val Gly Asn Vai Pro Glu Leu Ala Pro All ^^a Sa ^y^ 

2 30 

m"- Phf r'f'^ c:aG TTC ACG ACG GAG GAC GTG 

-1. Phe Thr Gly Phe His Glu Phe Ala Glu Pho Thr Thr Clu Aso Sli 

245 ■ 250 



066 



91-3 



962 



1 01 Q 






WO 97/49816 PCT/US97/11295

GGC ACC GCG GAG AGC GGC CTC AAC TCC ATG GTG CTC GCC P-AC AAC TCG 1058 
'Jlv Ala Glu Ser Giv Leu Asn Ser Moc Vai Leu Ala Asn Asn Ser 

' 253 ' ^'<^0 265 

GAG AA.C GTG CTG CTC CCG CTC .AAC GAG CCG GTG CAC GGC ACC r\AG CGC HOG 
"lu Asn Val Leu Leu Pro Leu Asn Glu Pre Vcji Wis Giy Tr.r Lys Arq 

270 -"'^ 

CGC AGC CAG ATA CAA ACG TTC CTG GAC CAC CAC GGC GGC CCC GGC GTG 1154 
Arq Ser Gin He Gin Thr Phe Leu Asp His His Glv Gly Pro Gly Vai 
235 290 295 

CAG CAC ATG GCG CTG GCC AGC GAC GAC GTG CTC AGG ACG CTG AGG GAG 1202 
Gin His Met Ala Lou Ala Ser Asp Asp Val Leu Arq Thr Leu Arg Glu 
300 305 310 

f^^G CAG GCG CGC TCG GCC ATG GGC GGC TTC GAG TTC ATG GCG CCT CCC 12 50 
Met Gin Ala Arg Ser Ala Met Gly Gly Pho Glu Phe Met Ala Pro Pro 
3 1 S 32 0 3 2 5 0 

ACA TCC GAC TAC TAT GAC GGC GTG AGG CGG CGC GCC GGG GAC GTG CTC 12 9S 
Thr Ser Asp Tv- Tvr Asc Gly Val Arq Arc Aru Ala Gly Asp Val Leu 
' 335 " 340 345 

-'-G GAA GCA CAG ATT A.^.G GAG TGC CAG GAG CTA GGG GTG CTG GTG GAC 
:'^r Glu Ala Gin lie Lvs Glu Cvs Gin Glu Leu oly Val Lou Val Asp 
350 355 360 

^\GG G^^7 GAC CAG GGC GTG CTG CTC C/\A ATC TTC ACC .^AG CCA GTG GGG 139 'i 
Arq Asp Asp Gin Glv Vai Leu Leu Gin lie Phe Thr Lys Pre Va. Gly 

GAC AGG CCA ACG CTG TTC TTG GAA ATC ATC CAA AGG ATC GGC TGC ATG 14 4 2 
Asp Arq Pro Thr Leu Phe Leu Glu lie lie Gin Arg lie Giy Cys Met 
380 385 390 

GAG AAG GAT GAG AAG GGG CA„A GA.A TAC C/.A AAG GGT GGC TGC GGC GGG 14 90 
Glu Lys AsD Glu Lys Giy Gin Glu Tyr Gin Lys Gly Giy Cys Gly '-iy 
395 400 405 ^510 

TTC GGC AAG GGA A.AC TTC TCG CAG CTG TTC AAG TCC ATC GAG GAT TAT 15 38 
^he Glv Lys Glv Asn Phe Ser Gin Leu Phe Lys Sei: lie Glu Aso Tyr 
^ .lis 4 20 4 25 

"AG AAG TCC CTT GAA GCC AAG CA.^. GCT GCT GCA GCA GCT GCA GCT CAG 158 6 
Glu Ly^ Leu Glu Ala Lys Gin Ala Ala Ala Ala Ala Ala Ala Gin 

430 435 440 

GGA TCC TAG GACAGTGCTT GGAGACGAGC AACTGCTGTG GCACTTTGTA 163 5 
Gly Ser 

TCATGGAACA GAAATAATGA AGCGTGTTCT TTGTGACACT TGACATGCAA ATGTTTGTGT 1695 

TCTGTAACCG TTGAATATAT GGGACGATGC TATGATGGTG TAATAGATGG TAGAGAGGGT 17 55 

ACAACCCTGA T ^"^^^ 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 445 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Pro Pro Thr Pro Thr Ala Ala Ala Ala Gly Ala Ala Val Ala Ala
1 5 10 15

Ala Ser Ala Ala Glu Gln Ala Ala Phe Arg Leu Val Gly His Arg Asn
20 25 30

Phe Val Arg Phe Asn Pro Arg Ser Asp Arg Phe His Thr Leu Ala Phe
35 40 45

His His Val Glu Leu Trp Cys Ala Asp Ala Ala Ser Ala Ala Gly Arg
50 55 60

Phe Ser Phe Gly Leu Gly Ala Pro Leu Ala Ala Arg Ser Asp Leu Ser
65 70 75 80

Thr Gly Asn Ser Ala His Ala Ser Leu Leu Leu Arg Ser Gly Ser Leu
85 90 95

Ser Phe Leu Phe Thr Ala Pro Tyr Ala His Gly Ala Asp Ala Ala Thr
100 105 110

Ala Ala Leu Pro Ser Phe Ser Ala Ala Ala Ala Arg Arg Phe Ala Ala
115 120 125

Asp His Gly Leu Ala Val Arg Ala Val Ala Leu Arg Val Ala Asp Ala
130 135 140

Glu Asp Ala Phe Arg Ala Ser Val Ala Ala Gly Ala Arg Pro Ala Phe
145 150 155 160

Gly Pro Val Asp Leu Gly Arg Gly Phe Arg Leu Ala Glu Val Glu Leu
165 170 175

Tyr Gly Asp Val Val Leu Arg Tyr Val Ser Tyr Pro Asp Gly Ala Ala
180 185 190

Gly Glu Pro Phe Leu Pro Gly Phe Glu Gly Val Ala Ser Pro Gly Ala
195 200 205

Ala Asp Tyr Gly Leu Ser Arg Phe Asp His Ile Val Gly Asn Val Pro
210 215 220

Glu Leu Ala Pro Ala Ala Ala Tyr Phe Ala Gly Phe Thr Gly Phe His
225 230 235 240

Glu Phe Ala Glu Phe Thr Thr Glu Asp Val Gly Thr Ala Glu Ser Gly
245 250 255

Leu Asn Ser Met Val Leu Ala Asn Asn Ser Glu Asn Val Leu Leu Pro
260 265 270

Leu Asn Glu Pro Val His Gly Thr Lys Arg Arg Ser Gln Ile Gln Thr
275 280 285

Phe Leu Asp His His Gly Gly Pro Gly Val Gln His Met Ala Leu Ala
290 295 300

Ser Asp Asp Val Leu Arg Thr Leu Arg Glu Met Gln Ala Arg Ser Ala
305 310 315 320

Met Gly Gly Phe Glu Phe Met Ala Pro Pro Thr Ser Asp Tyr Tyr Asp
325 330 335

Gly Val Arg Arg Arg Ala Gly Asp Val Leu Thr Glu Ala Gln Ile Lys
340 345 350

Glu Cys Gln Glu Leu Gly Val Leu Val Asp Arg Asp Asp Gln Gly Val
355 360 365

Leu Leu Gln Ile Phe Thr Lys Pro Val Gly Asp Arg Pro Thr Leu Phe
370 375 380

Leu Glu Ile Ile Gln Arg Ile Gly Cys Met Glu Lys Asp Glu Lys Gly
385 390 395 400

Gln Glu Tyr Gln Lys Gly Gly Cys Gly Gly Phe Gly Lys Gly Asn Phe
405 410 415

Ser Gln Leu Phe Lys Ser Ile Glu Asp Tyr Glu Lys Ser Leu Glu Ala
420 425 430

Lys Gln Ala Ala Ala Ala Ala Ala Ala Gln Gly Ser
435 440

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1356 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1254

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..3

(D) OTHER INFORMATION: /standard_name= "translation initiation codon"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1252..1254

(D) OTHER INFORMATION: /standard_name= "translation termination codon"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATG TCC AAG TTC GTA AGA AAG AAT CCA AAG TCT GAT AAA TTC AAG GTT 48

Met Ser Lys Phe Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val
1 5 10 15

AAG CGC TTC CAT CAC ATC GAG TTC TGG TGC GGC GAC GCA ACC AAC GTC 96

Lys Arg Phe His His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val
20 25 30
..-^ ^jirt UMU (jrtl- GCG TCG TCG TTC CCA TTG GAT cr"'-

= iy Phe Glu Arg Val Glu Asp Aia Ger Scr Phe Pro Leu Asd ■^'■■'^ Giv 
180 165 > 



;i5 220 



ol ^'r'' ^r'^ ^'^^ ^'^'^ ^'^A ACC GCC GAG AGC GOT TTA A^Vf TCA GCG 

Phe Tnr Ala Asp Asp Val Gly Thr Al. Gl. Sor Gly Leu Asn llr A^a 

230 235 240 

GTC CTG GCT AGC AAT GAT GAA ATG GTT CTT OTA CCG ATT AAC GAG CCA 

-^ol Leu Ala Ser Asn Asp Glu Met Val Leu Leu Fro lie Asn G^.- Pro 

245 250 255 

GTG CAC GGA ACA AAG AGG A.^G AGT GAG ATT CAG ACG TAT TTG GAA CAT 

vai .-.is Gly Thr Lys Arq Lys Ser Gin lie Gin Thr Tyr Leu Glu ills 

260 265 270 



Tsn r^t r? ^^"^ '^^'^ ""^^ ^'^'^ '^-T CAA GAC ATA 

Asn Glu Gly Ala Gly Leu Cln His L«u Ala Leu Met Ser Glu Asp lie 
-^5 280 285 



240 



258 



■■2 6
GCT CGT CGC TTC TCC TGG GGT CTG GGG ATG AGA TTC TCC GCC AAA TCC 144
Ala Arg Arg Phe Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser
35 40 45

GAT CTT TCC ACC GGA A-AC ATG GTT CAC GCC TCT TAG CTA CTG ACC TCr ig-^ 
Asp .eu Car Thr Gly Asn Met Val His Ala Ser Tvr Leu Leu Thr Sei' 

S5 ^,0 

GGT GAC GTC CGA TTC CTT TTC ACT GCT CCT TAC TCT CCG T^T CT^ T-- 
•^xy Asp Leu Arg Phe Leu Phe Thr Ala Pro Tvr Ser Pro Ser L-c Ser 

70 75 - go 

CCC GGA GAG ATT AAA CCG ACA ACC ACA GCT TCT ATC CCA AGT T-r gat 
Ala b.y Giu He Lys Pro Thr Thr Thr Aia Ser ll- fr- Fhp r^o- 

S5 90 95 ■ ""^ 

CAC GCC TCT TGT CGT TCC TTC TTC TCT TCA CAT GG^ PT'- GGt C-- — - 
Hxs Gly ser Cys Arg Ser Phe .he Ser Ser Mi. Gly lIu ^1:. Va L Ar^ 
100 105 ii'o 

GCC GTT GCG ATT GAA GTA G.AA GAC GCA GAG TCA GCT TTC TCC ATC PGT 
Ala Val Ala He Glu Val Giu Asp Ala Glu Ser Ala Phe Ser lie Ser 

120 12 5 

CTA GCT AAT GCC CCT ATT CCT TCG TCG CCT CCT ATC GTC CTC gaA 
val ...a «sn Gly Ala He Pro Ser Ser Pro Pro I J .... v., 1 ^.eu Asn Gb> 
^■^'^ 135 i^io 

'^T^ F;'^ •'^^^ '^'^'^ '^'"^C GGC GAT GTT GTT CTr ."ga 

Ala ..a. Thr He Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg 

150 155 

TAT GTT AGT TAC AAA GCA G.AA. GAT ACC GAA A.V. TCr GAA TT" -^-r- rr-p 
iyr Vai Ser Tyr Lys Ala Glu Asp Thr Glu Lys Ser Glu Ph^ L^u Pro 
165 170 175 

n"^^. ^""^ ^?"- ^-'^ -C'- TTC CCA TTG GAT TAT GGT 

Asp 
190 

ATC CGG CGG CTT GAC CAC GCC GTG GGA AAC GTT CCT GAG CTT r:c'r rr-- 
i.e Ara Arg Leu Asp Hx.. Ala Val Gly As.. Val Pro Giu ill Pro 
^95 200 205 

GCT TTA ACT TAT GTA GCG GGG TTC ACT GGT TTT CAC C/A TTC GCA GAG 
Ala .eu Thr Tyr Val Ala Gly Phe Thr Gly Phe His Gin Ph^ Ala Glu ' 



528 



>7 6 



/20 



*6S 



816 



864
TTC AGG ACC CTG AGA GAG ATG AGG AAG AGG AGC AGT ATT GGA GGA TTC 912

Phe Arg Thr Leu Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe
290 295 300

GAC TTC ATG CCT TCT CCT CCG OCT ACT TAC TAG CAG A/-T CTC .A.^.G AAA 9 60 

Pso ^he Met Pro Ser Pro Pro Fro Thr Tyr Tyr Gin Asr. Lou Ly.^ Lys 
305 " 3i0 31b 320 

CGG GTC GGC GAC GTG CTC AGC GAT GAT CAG ATC ^.AG GAG TGT GAG GAA 1CC8 
Ara Vai Gly Asp Vai Lea Ser Asp Asp Gin lie Lys Glu Cys Glu Giu 
325 330 335 

TTA GGG ATT CTT GTA GAC AGA GAT GAT C.A.A GGG ACG TTC CTT CAA ATC 1056 
Leu Gly 11'- Leu Vai Asd Ara Asd Asp Gin Giy Thr Leu Leu Gin lie 
340 34 5 350 

TTC ACA A^\A CCA CTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA ATC ilC4 
Phe Thr Lys Pro Leu Giy Asp Arq Pro Ttir lie Phe lie Giu lie lie 
355 360 365 

CAG AGA GTA GGA TGC ATG ATG AfiJK GAT GAG GAJ>^ GGG .A^G GCT TAC CAG 1152 
Gin Arq Vai Giv Cys Met Met Lys Aso Gi;: Giu Giy Lys Ala Tyr Gin 
* 570 ■ 375 " 3R0 

AGT GGA GGA TGT GGT GGT TTT GGC AAA GGC AAT TTC TCT GAG CTC TTC i:c:. 
Ser Giv Giv Cys Giv Giy Phe Giy Lys Giy Asn Phe Ser GVu Leu Phe 
385 ' ' ^ 39C 395 ^00 

AAG TCC ATT G/^^ GAA TAC GA-^ P-JkG ACT CTT GAA GCC AAA CAG TTA GTG 12^8 
Lys Ser lie Glu Glu Tvr Giu Lys Thr Leu Giu Ala Lys Gir> Leu Vai 
AOb ^liO 'lis 

GGA TGA AC-AAGAJ\GA.A GAACCAACTA AAaGGATTGTG ^J^.TTrJ^.TQT .AA.AA.CTGTTT 130m 
Giy * 

TATCTTATCA A.AACA-ATGTA TACA.ACATCT CATTT^^J^.A-A.^^ CGAGATCA.AT CC 13 56 
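
The feature table for SEQ ID NO: 12 gives the coding region as LOCATION 1..1254, with the termination codon at 1252..1254. As a quick consistency check, such a span can be converted to a codon count. The following is a minimal sketch only: the function name `cds_codon_count` is hypothetical and not part of the listing, and it assumes the LOCATION coordinates are 1-based and inclusive.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS span.

    Assumes 1-based, inclusive coordinates, as in the LOCATION
    fields of the sequence listing (an assumption of this sketch).
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not be smaller than start")
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    # CDS span of SEQ ID NO: 12 as stated in its feature table.
    print(cds_codon_count(1, 1254))
    # Termination codon span of SEQ ID NO: 12.
    print(cds_codon_count(1252, 1254))
```

For the span 1..1254 the function returns 418 codons, the last of which is the codon annotated at 1252..1254 as the translation termination codon.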

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ser Lys Phe Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val
1 5 10 15

Lys Arg Phe His His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val
20 25 30

Ala Arg Arg Phe Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser
35 40 45

Asp Leu Ser Thr Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser
50 55 60

Gly Asp Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser
65 70 75 80

Ala Gly Glu Ile Lys Pro Thr Thr Thr Ala Ser Ile Pro Ser Phe Asp
85 90 95

His Gly Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg
100 105 110

Ala Val Ala Ile Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser
115 120 125

Val Ala Asn Gly Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu
130 135 140

Ala Val Thr Ile Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg
145 150 155 160

Tyr Val Ser Tyr Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro
165 170 175

Gly Phe Glu Arg Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly
180 185 190

Ile Arg Arg Leu Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro
195 200 205

Ala Leu Thr Tyr Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu
210 215 220

Phe Thr Ala Asp Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala
225 230 235 240

Val Leu Ala Ser Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro
245 250 255

Val His Gly Thr Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His
260 265 270

Asn Glu Gly Ala Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile
275 280 285

Phe Arg Thr Leu Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe
290 295 300

Asp Phe Met Pro Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys
305 310 315 320

Arg Val Gly Asp Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu
325 330 335

Leu Gly Ile Leu Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile
340 345 350

Phe Thr Lys Pro Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile
355 360 365

Gln Arg Val Gly Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln
370 375 380

Ser Gly Gly Cys Gly Gly Phe Gly Lys Gly Asn Phe Ser Glu Leu Phe
385 390 395 400

Lys Ser Ile Glu Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val
405 410 415

Gly *

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1443 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 9..1346

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 9..11

(D) OTHER INFORMATION: /standard_name= "translation initiation codon"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1344..1346

(D) OTHER INFORMATION: /standard_name= "translation termination codon"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TGAAA.TCA ATG GGC CAC CAA AAC GCC GCC GTT TCA GAG AAT CAA AAC CAT 50 
Met Glv His Gin Asn Ala Ala Val Ser Glu Asn Gin Asn His 
1 5 10 

GAT GAC GGC GCT GCG TCG TCG CCG GGA TTC AAG CTC GTC GGA TTT TCC 98 
Asp Asp Gly Ala Ala Ser Ser Pro Glv Phe Lvs Leu Val Cly Phe Ser 
15 20 ' 21 30 

AAG TTC GTA AGA AAG AAT CCA AAG TCT GAT h/'J^^ TTC AA.G GTT AAG CGC 14 b 

Lys Phe Val Arc Lys Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg 
35 4 0 * 45 

TTC CAT CAC ATC GAG TTC TGG TGC GGC GAC GCA ACC AAC GTC GCT CGT 194 
Phe His His lie Glu Phe Trp Cys Gly Aso Ala Thr Asn Val Ala Arg 
50 55 " 60 

CGC TTC TCC TGG GGT CTG GGG ATG AGA TTC TCC GCC AAA TCC GAT CTT 24 2 

Arg Phe Ser Trp Gly Lou Gly Met Arg Phe Ser Ala Lys Ser Asp Leu 
6 5 7 0 7 5 

TCC ACC GGA P^J\C ATG GTT CAC GCC TCT TAC CTA CTC ACC TCC GGT GAC 2 90 

Ser Thr Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Asp 
80 85 90 

CTC CGA TTC CTT TTC ACT GCT CCT TAC TCT CCG TCT CTC TCC GCC GGA 338 
Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Ala Gly 
95 100 105 110 

GAG ATT AAA CCG ACA ACC ACA GCT TCT ATC CCA AGT TTC GAT CAC GGC 386

Glu Ile Lys Pro Thr Thr Thr Ala Ser Ile Pro Ser Phe Asp His Gly
115 120 125
TCT TGT CGT TCC TTC TTC TCT TCA CAT GGT CTC GGT GTT AGA GCC GTT 434

Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Ala Val

130 135 140

GCG ATT GA/\ GTA GA.A GAG GCA GAG TCA GCT TTC TCC ATC ACT GTA GC7 ^62 

Aia lie Giu Val Giu Asp Aia Glu Ser Aia ?he Ser lie Ser Val Aia 

14 5 150 155 

AAT GGC GCT ATT CCT TCC TCG CCT CCT ATC GTC CTC AAT GP'JK GCA GTT 530 

Asn Gly Ala lie Pro Ser Ser Pre Pre lie Val Leu Asn Glu Aia Val 

160 165 170 

ACG ATC GCT GAG GTT AAA CTA TAG GGC GAT GTT GTT CTC CGA TAT GTT 57 8 

Thr rie Ala Giu Val l.ys Leu Tyr Glv Asn V.'i 1 Vnl Lou Arq Tvr Val 

175 180 ^ * ' 185 ' ' 190 

AGT TAC J^J\A GCA G.AA GAT ACC GAA A/iA TCC GTVA TTC TTG CCA GGG TTC 62 6 

Ser Tyr Lys Aia Giu Asd Thr Giu Lvs Ser Glu Phe Leu Pro Glv Phe 

195 * ' 200 205 

GAG CGT GTA GAG GAT GCG TCG TCG TTC CCA TTG GAT TAT GGT ATC CGG 67^ 

Giu Arq Val Glu Asp Ala Ser Sc-r Phe Pro Leu Asp Tvr Glv Il:i- Arq 

210 ■ 215 220 

CGG CTT GAG CAC GCC GTG GGA AAC GTT CCT GAG CT^ GGT CCG GCT TTA 2 2 

Arq Leu Asp His Aia Val Gly Asn Val L'ro Glu Leu'Glv Pro Ai^: Leu 

225 230 235 

ACT TAT GTA GCG GGG TTC ACT GGT TTT CAC CAA TTC GCA GAG TTC ACA 77 0 

Thr Tyr Val Aia Gly Phe Thr Glv Phe His Gin Pho Aia Glu Phe Thr 

240 245 ' 250 

GCA GAC GAC GTT GGA ACC GCC GAG AGC GGT TTA AAT TCA GCG GTC CTG 8 IB 

Ala Asp Asp Val Gly Thr Aia Giu Ser Gly Leu Asn Ser Ala Val Leu 

255 260 265 270 

GCT AGC AAT GAT G.AA ATG GTT CTT CTA CCG ATT r-J\C GAG CCA GTG CAC 8 66 

Ala Ser Asn Asp Giu Me t Val Leu Leu Pro lie Asn Glu Pro Val ills 

275 2S0 265 

GGA AGA AAG AGG AAG AGT CAG ATT GAG ACG TAT TTG G^.A CAT A„AC GAA. 1 '1 

Gly Thr Lys Arq Lys Set Gin lie GJ r- Thr 'I'vi L'_'U Glu H^s As:: Giu 

290 ' 295 ' 300 

GGC GCA GGG CTA CAA CAT CTG GCT CTG ATG AGT G.AA GAC ATA TTG AGG 962 

Gly Ala Gly Leu Gin H^s Leu Ala Leu Met Scr Glu Asp lie Pho Arg 

305 310 315 

ACC CTG AGA GAG ATG AGG AAG AGG AGC AGT ATT GGA GGA TTC CAC TTC 1010 

Thr Leu Arg Glu Mec Arg Lys Arg Ser Ser lie Gly Gly Phe Asp Phe 

320 " 325 330 

ATG CCT TCT CCT CCG CCT ACT TAG TAC CAG AAT CTC AAG /iA/; CGG GTC ■ 1053 

Met Pro Ser Pro Pro Pro Thr Tyr Tyr Gin Asn Leu Lys Lys Arc Val 

335 340 345 ' ' 350 

GGC GAC GTG CTC AGC GAT GAT CAG ATC AAG GAG TGT GAG GAA TTA GGG 1106 

Gly AsD Val Leu Ser Asp Asp Gin lie Lvs Giu Cys Giu Glu Leu Gly 

355 360 365 

ATT CTT GTA GAC AGA GAT GAT CAA GGG ACG TTG CTT CAA ATC TTC ACA 1154 

lie Leu Val Asp Arg Asp Asp Gin Gly Thr Leu Leu Gin Tie Phe Thr 

370 ' 375 330 
AAA CCA CTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA ATC CAG AGA 1202
Lys Pro Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile Gln Arg
385 390 395

GTA GGA TGC ATG ATG AAA GAT GAG GAA GGG AAG GCT TAC CAG AGT GGA 1250
Val Gly Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln Ser Gly
400 405 410

GGA TGT GGT GGT TTT GGC AAA GGC AAT TTC TCT GAG CTC TTC AAG TCC 1298
Gly Cys Gly Gly Phe Gly Lys Gly Asn Phe Ser Glu Leu Phe Lys Ser
415 420 425 430

ATT GAA GAA TAC GAA AAG ACT CTT GAA GCC AAA CAG TTA GTG GGA TGA 1346
Ile Glu Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val Gly
435 440 445

ACAAGAAGAA GAACCAACTA AAGGATTGTG TAATTAATGT AAAACTGTTT TATCTTATCA 1406

AAACAATGTA TACAACATCT CATTTAAAAA CGAGATCAAT CC 1448
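
As a cross-check on a listing of this kind, the printed codons can be translated with the standard genetic code and compared against the amino acid line beneath them. The following Python sketch is illustrative only and is not part of the original disclosure; the helper name `translate` is hypothetical, and the input is the row of the listing that ends at nucleotide 98.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding strand in frame, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Codons for residues 15-30 as printed in the listing.
row = "GAT GAC GGC GCT GCG TCG TCG CCG GGA TTC AAG CTC GTC GGA TTT TCC"
print(translate(row))  # one-letter codes for Asp Asp Gly Ala Ala Ser Ser Pro ...
```

A row whose translation disagrees with the printed residues is a candidate transcription error in either the nucleotide line or the amino acid line.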

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 445 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Met Gly His Gln Asn Ala Ala Val Ser Glu Asn Gln Asn His Asp Asp
1 5 10 15

Gly Ala Ala Ser Ser Pro Gly Phe Lys Leu Val Gly Phe Ser Lys Phe
20 25 30

Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg Phe His
35 40 45

His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val Ala Arg Arg Phe
50 55 60

Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser Asp Leu Ser Thr
65 70 75 80

Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Asp Leu Arg
85 90 95

Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Ala Gly Glu Ile
100 105 110

Lys Pro Thr Thr Thr Ala Ser Ile Pro Ser Phe Asp His Gly Ser Cys
115 120 125

Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Ala Val Ala Ile
130 135 140

Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala Asn Gly
145 150 155 160

Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val Thr Ile
165 170 175

Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg Tyr Val Ser Tyr
180 185 190

Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro Gly Phe Glu Arg
195 200 205

Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly Ile Arg Arg Leu
210 215 220

Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro Ala Leu Thr Tyr
225 230 235 240

Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu Phe Thr Ala Asp
245 250 255

Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala Val Leu Ala Ser
260 265 270

Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro Val His Gly Thr
275 280 285

Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His Asn Glu Gly Ala
290 295 300

Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile Phe Arg Thr Leu
305 310 315 320

Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe Asp Phe Met Pro
325 330 335

Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys Arg Val Gly Asp
340 345 350

Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu Leu Gly Ile Leu
355 360 365

Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile Phe Thr Lys Pro
370 375 380

Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile Gln Arg Val Gly
385 390 395 400

Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln Ser Gly Gly Cys
405 410 415

Gly Gly Phe Gly Lys Gly Asn Phe Ser Glu Leu Phe Lys Ser Ile Glu
420 425 430

Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val Gly
435 440 445

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 513 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Vernonia galamenensis

(vii) IMMEDIATE SOURCE:

(B) CLONE: vs1.pk0015.b2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

CCACACCGAT TGCCGGAACT TCACCGCCTC TCACGGCCTT GCAGTCCGAG CAATCGCCAT 60

TGAAGTCGAT GACGCCGAAT TAGCTTTCTC CGTCAGCGTC TCTCACGGCG CTAAACCCTC 120

CGCTGCTCCT GTAACCCTTG GAAACAACGA CGTCGTATTG TCTGAAGTTA AGCTTTACGG 180

CGATGTCGCT TTCCGGTACA TAAGTTACAA AAATCCGAAC TATACATCTT CCTTTTTGCC 240

CGGGTTCGAG CCCGTTGAAA AGACGTCGTC GTTTTATGAC CTTGACTACG GTATCCGCCG 300

TTTGGACCAC CCCGTAGGNA ACGTCCCTGA GCTTGCTTCG GCAGTGGACT ACGTGAAATC 360

ATTCACCGGA TTCCATGAGT TCGCCGAATT CACCGCGGAG GACGTCGGGA CGAGCGAGAG 420

GGAACTGAAT TCGGTCGTTT TAGCTTGCAA CAGTGAGATG GTCTTGATTC CGATGAACGA 480
GCCGGTGTAC GGAANAAAAG GAAGNAGCCA GAT 513
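
Each full row of a nucleotide listing carries six groups of ten bases followed by a running total, so the totals can be used to confirm that no bases were dropped in transcription. The sketch below is an illustrative check, not part of the original disclosure; `parse_listing_row` is a hypothetical helper, and the two rows are those ending at positions 120 and 360 of SEQ ID NO:16.

```python
def parse_listing_row(line: str) -> tuple[str, int]:
    """Split a listing row into its bases and the running total at the right margin."""
    *groups, total = line.split()
    return "".join(groups), int(total)


rows = [
    "TGAAGTCGAT GACGCCGAAT TAGCTTTCTC CGTCAGCGTC TCTCACGGCG CTAAACCCTC 120",
    "TTTGGACCAC CCCGTAGGNA ACGTCCCTGA GCTTGCTTCG GCAGTGGACT ACGTGAAATC 360",
]

for row in rows:
    seq, total = parse_listing_row(row)
    # A full row holds 60 bases, so its running total is a multiple of 60.
    assert len(seq) == 60 and total % 60 == 0
    # "N" marks a base the sequencing run could not resolve.
    print(total, len(seq), seq.count("N"))
```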
CLAIMS

1. An isolated nucleic acid fragment encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the fragment comprising a nucleotide
sequence selected from the group consisting of

(a) nucleotide sequences encoding a polypeptide comprising the amino
acid sequences set forth in SEQ ID NO:3, SEQ ID NO:11, SEQ ID
NO:13, and SEQ ID NO:15 and

(b) modified nucleotide sequences essentially similar to the nucleotide
sequences of SEQ ID NO:2, SEQ ID NO:10, SEQ ID NO:12 and
SEQ ID NO:14 containing deletions, insertions, or substitutions in
the sequence that do not affect the functional properties of the
encoded protein.

2. An isolated nucleic acid fragment encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the fragment comprising a nucleotide sequence as
set forth in SEQ ID NO:14.

3. A chimeric gene comprising the nucleic acid fragment of Claims 1 or
2 operably linked to at least one suitable regulatory sequence.

4. The chimeric gene of Claim 3 wherein at least one suitable regulatory
sequence directs gene expression in a microorganism.

5. The chimeric gene of Claim 3 wherein the at least one suitable
regulatory sequence directs gene expression in a plant.

6. A plasmid vector comprising the nucleic acid fragment of Claims 1 or
2 operably linked to at least one suitable regulatory sequence.

7. A transformed host cell comprising a host cell and the plasmid vector
of Claim 6.

8. The transformed host cell of Claim 7 wherein the host cell is derived
from a plant or is a microorganism.

9. The transformed host cell of Claim 8 wherein the microorganism is
E. coli.

10. A transformed plant tolerant to contact with at least one compound
that inhibits the rate of the reaction of p-hydroxyphenylpyruvate dioxygenase
enzyme in a non-transformed plant, the transformed plant comprising the chimeric
gene of Claim 3 and a host plant.

11. The transformed plant of Claim 10 wherein the host plant is a cereal
crop plant.

12. A method to identify a compound useful for its ability to inhibit the
rate of the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme comprising:

(a) transforming a host cell with the plasmid vector of Claim 6;
(b) facilitating expression of the nucleic acid fragment encoding the
plant p-hydroxyphenylpyruvate dioxygenase enzyme;

(c) contacting the expressed enzyme from step (b) with a test
compound; and

(d) evaluating the capacity of the test compound to inhibit the rate of
the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme.

13. The method of Claim 12 wherein evaluating the capacity of the test
compound to inhibit the rate of the reaction of p-hydroxyphenylpyruvate
dioxygenase enzyme is accomplished by measuring oxygen utilization, carbon
dioxide release, homogentisate production, loss of p-hydroxyphenylpyruvate or
maleylacetoacetate production.

14. The method of Claim 12 wherein the transformed host cell is an
E. coli that comprises a chimeric gene encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme.

15. A compound that inhibits the activity of a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the compound identified by the method of
Claim 14.

16. A method for imparting tolerance to a plant to at least one compound
that inhibits the rate of reaction of p-hydroxyphenylpyruvate dioxygenase enzyme
comprising:

(a) transforming a host plant cell with a chimeric gene comprising a
nucleic acid fragment encoding plant p-hydroxyphenylpyruvate
dioxygenase; and

(b) expressing the chimeric gene in an amount effective to render
the transformed plant substantially tolerant to the at least one
compound that inhibits the rate of reaction of p-hydroxyphenylpyruvate
dioxygenase.

17. A method for the microbial production of active plant
p-hydroxyphenylpyruvate dioxygenase enzyme comprising:

(a) stably transforming a microorganism with the chimeric gene of
Claim 4 encoding the plant p-hydroxyphenylpyruvate
dioxygenase;

(b) facilitating expression by the chimeric gene for a suitable period;
and

(c) recovering active plant p-hydroxyphenylpyruvate dioxygenase
enzyme.

18. A method to overexpress p-hydroxyphenylpyruvate dioxygenase
enzyme in a plant comprising:

(a) stably transforming a host plant cell with a chimeric DNA
molecule comprising at least one copy of a suitable promoter to
drive expression of an associated coding sequence in a plant cell
operably linked to at least one copy of a homologous or
heterologous coding sequence encoding p-hydroxyphenylpyruvate
dioxygenase; and

(b) growing the transformed host plant cell of step (a).

19. The method of Claim 18 wherein the chimeric DNA molecule is the
chimeric gene of Claim 5.

20. An isolated nucleic acid fragment comprising a member selected from
the group consisting of:

(a) an isolated nucleic acid fragment as set forth in SEQ ID NO:16;

(b) an isolated nucleic acid fragment that is essentially similar to an
isolated nucleic acid fragment as set forth in SEQ ID NO:16;
and

(c) an isolated nucleic acid fragment that is complementary to (a) or
(b).
FIG. 1



1 CAAGAAACGNGTCGNCGACGTGCTCAGCGATGATCAGATCAAGGAGTGTGAGGAATTAGG 

61 GATTCTTNTAGACAGAGATGATCAAGGGACGTTNCTTCAAATCTNCACAAAACCACTAGG 

121 TGACAGGCCGACGNTATTTATAGAGATAATCCAGAGNGTAGGATGCATGATGAAAGATGT 

181 GGAAGGGANGGCTTACCAGAGTGGAGNATNTNGTGGTTTTGGCAAAGGCAATT 
FIG. 2

1 TGAAATCAATGGGCCACCAAAACGCCGCCGTTTCAGAGAATCAAAACCATGATGACGGCG

61 CTGCGTCGTCGCCGGGATTCAAGCTCGTCGGATTTTCCAAGTTCGTAAGAAAGAATCCAA

121 AGTCTGATAAATTCAAGGTTAAGCGCTTCCATCACATCGAGTTCTGGTGCGGGGACGCAA 

Eco47III

181 CCAACGTCGCTCGTCGCTTCTCCTGGGGTCTGGGGATGAGATTCTCCGCCAAATCCGATC 

241 TTTCCACCGGAAACATGGTTCACGCCTCTTACCTACTCACCTCCGGTGAACTCCGATTCC 

301 TTTTCACTGCTCCTTACTCTCCGTCTCTCTCCGGCGGAGAGATTAAACCGACAACCACAG

361 GTTCTATCCCAAGTTTCGATCACGGGTCTTGTCGGTCCTTCTTCTCTTCACATGGTCTCG 

421 GTGTTAGACCCGTTGCGATTGAAGTAGAAGACGCGGAGTCAGCTTTCTCCATCAGTGTAG 

481 CTAATGGCGCTATTCCTTCGTCGCCTCCTATCGTCCTCAATGAAGCAGTTACGATCGCTG 

541 AGGTTAAACTATACGGCGATGTTGTTCTCCGATATGTTAGTTACAAAGCAGAAGATACCG 

601 AAAAATCCGAATTCTTGCCAGGGTTCGAGCGTGTAGAGGATGCGTCGTCGTTCCCATTGG 
EcoRI

661 ATTATGGTATCCGGCGGCTTGACCACGCCGTGGGAAACGTTCCTGAGCTTGGTCCGGCTT 

721 TAACTTATGTAGCGGGGTTCACTGGTTTTCACCAATTCGCAGAGTTCACAGCAGACGACG

781 TTGGAACCGCCGAGAGCGGTTTAAATTCAGCGGTCCTGGCTAGCAATGATGAAATGGTTC 

NheI

841 TTCTACCGATTAACGAGCCAGTGCACGGAACAAAGAGGAAGAGTCAGATTCAGACGTATT 

901 TGGAACATAACGAAGGCGCAGGGCTACAACATCTGGCTCTGATGAGTGAAGACATATTCA 

961 GGACCCTGAGAGAGATGAGGAAGAGGAGCAGTATTGGAGGATTCGACTTCATGCCTTCTC 

1021 CTCCGCCTACTTACTACCAGAATCTCAAGAAACGGGTCGGCGACGTGCTCAGCGATGATC 

1081 AGATCAAGGAGTGTGAGGAATTAGGGATTCTTGTAGACAGAGATGATCAAGGGACGTTGC

1141 TTCAAATCTTCACAAAACCACTAGGTGACAGGCCGACGATATTTATAGAGATAATCCAGA

1201 GAGTAGGATGCATGATGAAAGATGAGGAAGGGAAGGCTTACCAGAGTGGAGGATGTGGTG

1261 GTTTTGCCAAAGGCAATTTCTCTGAGCTCTTCAAGTCCATTGAAGAATACGAAAAGACTC

1321 TTGAAGCCAAACAGTTAGTGGGATGAACAAGAAGAAGAACCAACTAAAGGATTGTGTAAT

1381 TAATGTAAAACTGTTTTATCTTATCAAAACAATGTATACAACATCTCATTTAAAAACGAG
TITLE

PLANT GENE FOR p-HYDROXYPHENYLPYRUVATE DIOXYGENASE

FIELD OF THE INVENTION
This invention relates to the isolation and modification of nucleic acid
encoding p-hydroxyphenylpyruvate dioxygenase enzyme from plants. These
nucleic acid sequences were used to establish methods of identification of new
herbicidal compounds that inhibit the activity of this enzyme, and to prepare new
crop plants that are tolerant to the herbicidal action of inhibitors of this enzyme.
Chimeric genes comprising nucleic acid fragments containing all or part of the
nucleic acid sequences encoding p-hydroxyphenylpyruvate dioxygenase may be
used to produce active plant p-hydroxyphenylpyruvate dioxygenase enzyme in
microorganisms, and to cause the production of modified forms of the enzyme in
plants that may render such plants tolerant to inhibitors of the enzyme.

BACKGROUND OF THE INVENTION

Bleaching herbicides affect plant chloroplasts by decreasing their
chlorophyll and carotenoid content. Several bleaching herbicides are known to
inhibit the enzyme phytoene desaturase, resulting in the accumulation of phytoene
in treated plants. However, compounds of the benzoyl cyclohexane-1,3-dione
type cause the accumulation of phytoene in plants but are not inhibitors of
phytoene desaturase in vitro (Sandmann, G., et al. (1990) Pestic. Sci. 30:353-355).
Subsequent work revealed that these compounds are effective inhibitors of
p-hydroxyphenylpyruvate dioxygenase (p-hydroxyphenylpyruvate:oxygen
oxidoreductase, EC 1.13.11.27), a key enzyme in the biosynthesis of
plastoquinones and tocopherols (Schulz, A., et al. (1993) FEBS Lett.
318:162-166). Based on the observation that phytoene desaturase requires a
quinone as an electron acceptor, these authors postulated that by inhibiting
p-hydroxyphenylpyruvate dioxygenase, these herbicides act indirectly on
phytoene desaturase by blocking the biosynthesis of quinones.

The proposal that p-hydroxyphenylpyruvate dioxygenase is essential for
carotenoid biosynthesis has received support from genetic studies in the plant
model system Arabidopsis thaliana. Mutations in the pds1 and pds2 genetic loci
result in mutant plants that accumulate phytoene. However, genetic mapping of
these mutant genes indicates that they do not correspond to the gene encoding the
enzyme phytoene desaturase. The pds1 mutation can be rescued by homogentisic
acid, the product of p-hydroxyphenylpyruvate dioxygenase. Therefore, this
mutation corresponds to a defect in the activity of p-hydroxyphenylpyruvate
dioxygenase (Norris, S. R., et al. (1995) Plant Cell 7:2139-2149).

In light of these disclosures, p-hydroxyphenylpyruvate dioxygenase is a
promising new target for new herbicidal compounds. Research aimed at
discovering new herbicides based on this mode of action would be greatly
facilitated by the isolation of the plant gene encoding this enzyme and by the
functional expression of this gene in transgenic organisms. For example, active
enzyme produced in recombinant microorganisms could be used to establish
screening methods for the identification of novel active compounds and to obtain
structural and mechanistic information useful to guide further chemical synthesis.
Furthermore, isolation of this gene would facilitate research aimed at generating
mutant, herbicide-tolerant versions of the enzyme that may confer herbicide
resistance to transgenic plants.

A partial sequence of an Arabidopsis thaliana cDNA with homology to
corresponding mammalian sequences encoding p-hydroxyphenylpyruvate
dioxygenase has been identified (GenBank Accession No. T20952), but this
truncated sequence is insufficient to identify an active plant
p-hydroxyphenylpyruvate dioxygenase. WO 96/38567 A2 addresses the utility that would be
attached to a DNA sequence of a p-hydroxyphenylpyruvate dioxygenase gene, but
there is no biochemical evidence of function associated with the sequences
disclosed.

SUMMARY OF THE INVENTION

This invention pertains to the isolation and characterization of nucleic acid
fragments encoding plant p-hydroxyphenylpyruvate dioxygenase enzymes. More
specifically, this invention pertains to isolated nucleic acid fragments encoding the
p-hydroxyphenylpyruvate dioxygenase enzymes from Arabidopsis thaliana and
Zea mays.

This invention also pertains to the production of active plant
p-hydroxyphenylpyruvate dioxygenase enzyme in E. coli. In one embodiment, a chimeric
gene comprising a nucleic acid fragment encoding a polypeptide that possesses
p-hydroxyphenylpyruvate dioxygenase activity, operably linked to regulatory
sequences that direct gene expression in E. coli, is claimed. In another
embodiment, a plasmid vector comprising said chimeric gene is disclosed. In yet
another embodiment, a transformed E. coli comprising a chimeric gene consisting
of a nucleic acid fragment encoding a polypeptide that possesses
p-hydroxyphenylpyruvate dioxygenase activity is disclosed.

This invention also pertains to a method of identifying substances that
inhibit the rate of the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme.
In one embodiment, the invention pertains to an assay for the detection of
inhibitors of p-hydroxyphenylpyruvate dioxygenase wherein a polypeptide
derived from a transformed E. coli that displays p-hydroxyphenylpyruvate
dioxygenase activity is incubated in the presence of a test substance. Following
incubation, p-hydroxyphenylpyruvate dioxygenase enzymatic activity is measured
wherein a reduction of enzymatic activity is indicative of the inhibitory capacity
of the test substance. Enzymatic activity can be measured by any appropriate
means, including but not limited to oxygen utilization, carbon dioxide release,
homogentisate production, and loss of p-hydroxyphenylpyruvate. Results are
quantified by radiometric, colorimetric or chromatographic means.
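
The reduction in enzymatic activity described above is commonly expressed as a percent inhibition relative to an uninhibited control. The following Python sketch illustrates that arithmetic only; the function name and the example rates are hypothetical and do not come from this disclosure, and any consistent rate unit (for example, oxygen consumed per minute) may be used.

```python
def percent_inhibition(control_rate: float, treated_rate: float) -> float:
    """Percent reduction in reaction rate caused by a test substance.

    control_rate: rate measured without the test substance.
    treated_rate: rate measured in the presence of the test substance.
    """
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - treated_rate / control_rate)


# Hypothetical example: the rate falls from 10.0 to 2.5 units per minute.
print(percent_inhibition(10.0, 2.5))
```

A value near zero indicates no effect on the enzyme under the assay conditions, while a value near 100 indicates nearly complete inhibition.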

In another embodiment, this invention pertains to plants that are
substantially tolerant to the application of at least one compound that inhibits the
rate of the reaction of p-hydroxyphenylpyruvate dioxygenase. Plants may be
rendered tolerant by overexpression of the wild-type p-hydroxyphenylpyruvate
dioxygenase, by expression of a naturally-occurring resistant variant of this
enzyme, or by expression of an altered form of p-hydroxyphenylpyruvate
dioxygenase that is resistant to the action of compounds that are inhibitory to the
wild-type enzyme.

A further embodiment of the invention is an isolated nucleic acid fragment
comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment as set forth in SEQ ID NO:16;
(b) an isolated nucleic acid fragment that is essentially similar to an
isolated nucleic acid fragment as set forth in SEQ ID NO:16;
and

(c) an isolated nucleic acid fragment that is complementary to (a) or
(b).

BRIEF DESCRIPTION OF THE 
DRAWINGS AND SEQUENCE DESCRIPTIONS 
The invention can be more fully understood from the following detailed
description and the accompanying drawings and the sequence descriptions which
form a part of this application.

Figure 1 presents a partial nucleic acid sequence of an expressed sequence
tag (EST) bearing GenBank Accession No. T92052 obtained from an Arabidopsis
thaliana cDNA library. This sequence was contained in clone 91B13T7 of the
library.

Figure 2 presents the nucleic acid sequence of the cloned cDNA encoding a
full-length form of Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
enzyme, as it was initially determined (SEQ ID NO:2). Translation start and stop
codons are underlined. Selected restriction sites are indicated.

Figure 3 presents the amino acid sequence comparison between full-length
p-hydroxyphenylpyruvate dioxygenases from Arabidopsis thaliana (SEQ ID
NO:15) and Zea mays (SEQ ID NO:11) and the p-hydroxyphenylpyruvate
dioxygenase enzymes derived from human (SEQ ID NO:6, GenBank Acc.
No. U29895), pig (SEQ ID NO:7, GenBank Acc. No. D13390), mouse (SEQ ID
NO:8, GenBank Acc. No. D29987) and rat (SEQ ID NO:9, GenBank Acc.
No. M18405). Asterisks indicate amino acid residues that are conserved across all
six species. This figure was created using the Pileup program of GCG (Program
Manual for the Wisconsin Package, Version 9.0-OpenVMS, December 1996,
Genetics Computer Group, 575 Science Drive, Madison, WI, USA 53711).
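
The asterisk convention used in an alignment figure of this kind can be expressed compactly: a column is marked only when every aligned sequence carries the same residue at that position. The sketch below is an illustration under that assumption and is not the Pileup program itself; the function name and the three short aligned strings are hypothetical toy data, with "-" standing for an alignment gap.

```python
def conservation_line(aligned: list[str]) -> str:
    """Return '*' under columns where all sequences share one residue, else ' '."""
    marks = []
    for column in zip(*aligned):
        # A column is conserved if it holds a single residue and no gap.
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy alignment (hypothetical, one-letter amino acid codes).
alignment = ["MGHQN", "MAHQ-", "MGHQN"]
for row in alignment:
    print(row)
print(conservation_line(alignment))
```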

Figure 4 is a diagram describing the construction of the intermediate 
plasmid vector pT7BlueR + PDOl. 

Figure 5 is a diagram describing the construction of E. coli expression
vector pE24CPl.

Applicants have provided a sequence listing in conformity with "Rules for
the Standard Representation of Nucleotide and Amino Acid Sequences in Patent
Applications" (Annexes I and II to the Decision of the President of the EPO,
published in Supplement No. 2 to OJ EPO, 12/1992) and with 37 C.F.R.
1.821-1.825 and Appendices A and B ("Requirements for Application Disclosures
Containing Nucleotides and/or Amino Acid Sequences").

SEQ ID NO:1 presents a partial nucleic acid sequence of an expressed
sequence tag (EST) bearing GenBank Accession No. T92052 obtained from an
Arabidopsis thaliana cDNA library. This sequence was contained in clone
91B13T7 of the library.

SEQ ID NO:2 presents the initial determination of the nucleic acid sequence
and the deduced amino acid sequence of a cDNA encoding a full-length form of
Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase enzyme, as
contained in plasmid pGBPPD2.

SEQ ID NO:3 presents the initially deduced amino acid sequence encoded
by a cDNA for Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
enzyme.

SEQ ID NOS:4 and 5 present the nucleotide sequences of a pair of
complementary oligonucleotides (CAM 32 and CAM 33, respectively) used to
facilitate subcloning and expression of the gene encoding
p-hydroxyphenylpyruvate dioxygenase without the chloroplast transit sequence.

SEQ ID NO:6 presents the amino acid sequence of
p-hydroxyphenylpyruvate dioxygenase enzyme derived from human (GenBank Acc. No. U29895).

SEQ ID NO:7 presents the amino acid sequence of
p-hydroxyphenylpyruvate dioxygenase enzyme derived from pig (GenBank Acc. No. D13390).

SEQ ID NO:8 presents the amino acid sequence of
p-hydroxyphenylpyruvate dioxygenase enzyme derived from mouse (GenBank Acc. No. D29987).

SEQ ID NO:9 presents the amino acid sequence of
p-hydroxyphenylpyruvate dioxygenase enzyme derived from rat (GenBank Acc. No. M18405).

SEQ ID NO:10 presents the nucleic acid sequence and deduced amino acid
sequence of the cloned cDNA encoding the Zea mays p-hydroxyphenylpyruvate
dioxygenase enzyme, as contained in plasmid pMPDO.

SEQ ID NO:11 presents the deduced amino acid sequence of the cloned
cDNA encoding the Zea mays p-hydroxyphenylpyruvate dioxygenase enzyme, as
contained in plasmid pMPDO.

SEQ ID NO:12 presents the nucleic acid sequence and the deduced amino
acid sequence of the truncated form of Arabidopsis thaliana
p-hydroxyphenylpyruvate dioxygenase enzyme as contained in pE24CPl.

SEQ ID NO:13 presents the deduced amino acid sequence of the truncated
form of Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase enzyme as
contained in pE24CPl.

SEQ ID NO:14 presents the revised nucleic acid sequence and the deduced
amino acid sequence of the cloned cDNA encoding the full-length Arabidopsis
thaliana p-hydroxyphenylpyruvate dioxygenase enzyme, as contained in plasmid
pGBPPD2.

SEQ ID NO:15 presents the revised amino acid sequence deduced from the
cDNA for the full-length Arabidopsis thaliana p-hydroxyphenylpyruvate
dioxygenase enzyme.

SEQ ID NO:16 presents the nucleic acid sequence determined from a
portion of a cDNA from Vernonia galamenensis, as contained in clone
vs1.pk0015.b2.

DETAILS OF THE INVENTION
BIOLOGICAL DEPOSITS

The following biological materials have been deposited under the terms of
the Budapest Treaty at American Type Culture Collection (ATCC), 12301
Parklawn Drive, Rockville, MD 20852, and bear the following accession
numbers:

Depositor Identification                Int'l. Depository
Host Strain          Plasmid            Accession Number     Date of Deposit
E. coli BL21(DE3)    pE24CPl            ATCC 98083           June 25, 1996
N/A                  pGBPPD2            ATCC 97622           June 25, 1996
N/A                  pMPDO              ATCC 209120          June 12, 1997


Definitions

In the context of this disclosure, a number of terms shall be utilized. As
used herein, the term "nucleic acid" refers to a large molecule which can be
single-stranded or double-stranded, composed of monomers (nucleotides)
containing a sugar, phosphate and either a purine or pyrimidine. A "nucleic acid
fragment" is a portion of a given nucleic acid molecule. As used herein, "DNA"
(deoxyribonucleic acid) is the genetic material, whereas "RNA" (ribonucleic acid)
is involved in the transfer of the information encoded by the DNA into proteins
and polypeptides. A "genome" is the entire body of genetic material contained in
each cell of an organism. The term "nucleotide sequence" refers to a polymer of
DNA or RNA which can be single- or double-stranded, optionally containing
synthetic, non-natural or altered nucleotide bases capable of incorporation into
DNA or RNA polymers.

As used herein, "essentially similar" refers to DNA sequences that may
involve base changes that do not cause a change in the encoded amino acid or
which involve base changes which may alter one or more amino acids, but do not
affect the functional properties of the protein encoded by the DNA sequence. It is
therefore understood that the invention encompasses more than the specific
exemplary sequences. Modifications to the sequence, such as deletions,
insertions, or substitutions in the sequence which produce "silent changes" (i.e.,
those that do not substantially affect the functional properties of the resulting
protein molecule) are also contemplated. For example, alteration(s) in the gene
sequence which reflect the degeneracy of the genetic code, or which result in the
production of a chemically equivalent amino acid at a given site, are
contemplated; thus, a codon for the amino acid alanine, a hydrophobic amino acid,
may be substituted by a codon encoding another less hydrophobic residue, such as
glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
Similarly, changes which result in substitution of one negatively charged residue
for another, such as aspartic acid for glutamic acid, or one positively charged
residue for another, such as lysine for arginine, can also be expected to produce a
biologically equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be
expected to alter the activity of the protein. In some cases, it may in fact be
desirable to make mutants of the sequence in order to study the effect of alteration
on the biological activity of the protein. Each of the proposed modifications is
well within the routine skill in the art, as is determination of retention of
biological activity of the encoded products. Moreover, the skilled artisan
recognizes that "essentially similar" sequences encompassed by this invention are
also defined by their ability to hybridize, under stringent conditions (0.1X SSC,
0.1% SDS), with the sequences exemplified herein.
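
The categories of base change described above can be illustrated by comparing the amino acids encoded before and after a codon substitution. The following Python sketch is an illustration only; the function name is hypothetical, and the residue groupings are an assumption limited to the examples named in the preceding paragraph (alanine with glycine, valine, leucine and isoleucine; aspartic acid with glutamic acid; lysine with arginine), not a general substitution matrix.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Groupings taken only from the examples in the text (an assumption).
EQUIVALENT_GROUPS = [set("AGVLI"), set("DE"), set("KR")]


def classify_change(codon_before: str, codon_after: str) -> str:
    """Classify a codon substitution as silent, conservative or non-conservative."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    if before == after:
        return "silent"  # degeneracy of the genetic code
    if any(before in group and after in group for group in EQUIVALENT_GROUPS):
        return "conservative"  # chemically equivalent residue
    return "non-conservative"


print(classify_change("GCT", "GCC"))  # alanine to alanine
print(classify_change("GAT", "GAA"))  # aspartic acid to glutamic acid
print(classify_change("GCT", "GAT"))  # alanine to aspartic acid
```

Whether a given conservative change actually preserves enzyme function remains an experimental question, as the paragraph above notes.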

"Gene" refers to a nucleic acid fragment that encodes a specific protein,
including regulatory sequences preceding (5' non-coding) and following (3'
non-coding) the coding region. "Native" gene refers to the gene as found in nature
with its own regulatory sequences. "Chimeric" gene refers to a gene comprising
heterogeneous regulatory and coding sequences. "Endogenous" gene refers to the
native gene normally found in its natural location in the genome. A "foreign"
gene refers to a gene not normally found in the host organism but that is
introduced by gene transfer.

"Coding sequence" refers to a DNA sequence that codes for a specific 
protein and excludes the non-coding sequences. 

"Initiation codon" and "termination codon" refer to a unit of three adjacent
nucleotides in a coding sequence that specifies initiation and termination,
respectively, of protein synthesis (mRNA translation). "Open reading frame"
refers to the amino acid sequence encoded between translation initiation and
termination codons of a coding sequence.
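
The relationship between initiation codon, termination codon and open reading frame can be sketched as a simple scan. The example below is a minimal illustration that assumes the first ATG is the initiation codon and reads in frame to the first stop codon; the function name is hypothetical, and the test sequence is a toy construct (the opening bases of Figure 2 followed by an artificial stop codon), not a sequence from this disclosure. The function returns the nucleotides from initiation codon through termination codon; the encoded amino acid sequence is obtained by translating that span.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna: str) -> str:
    """Return the span from the first ATG through the first in-frame stop codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""  # no initiation codon found
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return ""  # no in-frame termination codon found


# Toy sequence: 5' leader, ATG GGC CAC, artificial TGA stop, trailing bases.
toy = "TGAAATCAATGGGCCACTGAAC"
print(first_orf(toy))
```

Note that the leading TGA in the toy sequence is ignored because it lies upstream of the initiation codon.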

"RNA transcript" refers to the product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect
complementary copy of the DNA sequence, it is referred to as the primary
transcript or it may be a RNA sequence derived from posttranscriptional
processing of the primary transcript. "Messenger RNA" (mRNA) refers to RNA
that can be translated into protein by the cell. "cDNA" refers to a double-stranded
DNA, one strand of which is complementary to and derived from mRNA by
reverse transcription. "Sense RNA" refers to RNA transcript that includes the
mRNA.
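
Complementarity between two nucleic acid strands, as used in the definitions above, follows the base-pairing rules A with T and G with C, with the complementary strand read in the opposite direction. The following Python sketch illustrates this for DNA; the function name is hypothetical, and the input is the first three codons of the coding sequence shown in Figure 2.

```python
# Map each base to its pairing partner; N (unresolved base) pairs with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


coding = "ATGGGCCAC"  # ATG GGC CAC
print(reverse_complement(coding))
```

Applying the function twice returns the original strand, which is a convenient sanity check.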

As used herein, "regulatory sequences" are nucleotide sequences that control
the transcription or expression of a coding sequence and are located upstream (5'),
within, or downstream (3') to the coding sequence. They act in conjunction with the
protein biosynthetic apparatus of the cell and include promoters, translation leader
sequences, transcription termination sequences, and polyadenylation sequences.

"Promoter" refers to a DNA sequence in a gene, usually upstream (5') to its
coding sequence, which controls the expression of the coding sequence by
providing the recognition for RNA polymerase and other factors required for
proper transcription. A promoter may also contain DNA sequences that are
involved in the binding of protein factors which control the effectiveness of
transcription initiation in response to physiological or developmental conditions.
In the case of eukaryotic organisms, it may also contain enhancer elements.

An "enhancer element" is a DNA sequence which can stimulate promoter
activity. It may be an innate element of the promoter or a heterologous element
inserted to enhance the activity level and tissue-specificity of a promoter.
"Constitutive promoters" refers to those promoters that direct gene
expression in all tissues and at all times. "Organ-specific" or
"development-specific" promoters as referred to herein are those that direct
gene expression almost exclusively in specific organs, such as leaves or seeds,
or at specific development stages in an organ, such as in early or late
embryogenesis, respectively.

The term "operably linked" refers to nucleic acid sequences on a single
nucleic acid molecule which are associated so that the function of one is affected
by the other. For example, a promoter is operably linked with a structural gene
(i.e., a gene encoding p-hydroxyphenylpyruvate dioxygenase, as disclosed herein)
when it is capable of affecting the expression of that structural gene (i.e., that the
structural gene is under the transcriptional control of the promoter).

The term "expression", as used herein, is intended to mean the production of
the protein product encoded by a gene. More particularly, "expression" refers to
the transcription and stable accumulation of the sense RNA (mRNA) derived from
the nucleic acid fragment(s) of the invention that, in conjunction with the
protein biosynthetic apparatus of the cell, results in altered levels of protein
product. "Overexpression" refers to the production of a gene product in transgenic
organisms that exceeds levels of production in normal or non-transformed
organisms. "Altered levels" refers to the production of gene product(s) in
transgenic organisms in amounts or proportions that differ from those of normal or
non-transformed organisms. "Facilitating expression" refers to steps and
conditions for culturing host cells containing the desirable gene to yield an
increased production of the enzyme. For example, addition of a chemical inducer
specific to the particular promoter operably linked to the gene facilitates
expression of the encoded enzyme. This is measured relative to the production
levels of an untreated gene.

The "3' non-coding sequences" refers to the DNA sequence portion of a
gene that contains a polyadenylation signal and any other regulatory signal
capable of affecting mRNA processing or gene expression. The polyadenylation
signal is usually characterized by affecting the addition of polyadenylic acid tracts
to the 3' end of the mRNA precursor.

The "translation leader sequence" refers to that DNA sequence portion of a
gene between the promoter and coding sequence that is transcribed into RNA and
is present in the fully processed mRNA upstream (5') of the translation start
codon. The translation leader sequence may affect processing of the primary
transcript to mRNA, mRNA stability, or translation efficiency.

"Transformation" herein refers to the transfer of a foreign gene into the
genome of a host organism and its genetically stable inheritance. Bacterial
transformation can proceed by any of several methods well known in the art,
including calcium chloride-mediated transformation and electroporation.
Examples of methods of plant transformation include Agrobacterium-mediated
transformation and particle-accelerated or "gene gun" transformation technology
(U.S. Patent No. 4,945,050).

"Host cell" refers to the cell that is transformed with the introduced genetic 
material. 

"Plasmid vector" refers to a double-stranded, closed circular,
extrachromosomal DNA molecule.

"Tolerant" or "tolerance" refers to a condition whereby a cell or an organism
is able to withstand the effect of application of a compound or composition at a
concentration or application rate that causes a demonstrable effect in or against
cells or organisms that are not tolerant. For example, the growth or survival of a
plant that is tolerant to application of a herbicidal compound or composition will
be less affected than the growth or survival of a plant that is not tolerant to
application of the herbicidal compound or composition.

Cloning of Plant Genes Encoding p-Hydroxyphenylpyruvate Dioxygenase

The p-hydroxyphenylpyruvate dioxygenases from plants are a promising
new class of targets for new herbicidal compounds. In order to be able to study
this enzyme in detail, and to have available supplies of enzyme for inhibitor
screening, cDNA clones encoding plant p-hydroxyphenylpyruvate dioxygenases
were identified. These nucleic acid fragments are useful for the production of
their encoded enzymes, for isolation of clones from additional plant sources that
encode other p-hydroxyphenylpyruvate dioxygenase enzymes, and for
understanding the biochemical and structural properties of these enzymes.

Nucleic acid fragments comprising nucleotide sequences that encode
different forms of the enzyme p-hydroxyphenylpyruvate dioxygenase from the
plant Arabidopsis thaliana have now been isolated. Subsequently, these
nucleotide sequences were expressed in E. coli cells and shown to direct the
synthesis of plant p-hydroxyphenylpyruvate dioxygenase enzymes.

An automated search of nucleotide sequences contained in a database
representing an Arabidopsis cDNA library for sequences homologous to other
known, non-plant p-hydroxyphenylpyruvate dioxygenase genes identified the
plasmid cDNA clone 91B13T7. This cDNA was obtained from the Arabidopsis
Seed Stock Center at Ohio State University. Plasmid DNA suitable for nucleotide
sequence determination was prepared and the nucleotide sequence of the plasmid
insert was determined. The resulting sequence was not interpretable, suggesting
possible contamination of the plasmid sample by an extraneous nucleic acid. This
assumption was confirmed by digesting the plasmid DNA sample with restriction
enzymes and separating the resulting nucleic acid fragments by agarose gel
electrophoresis. This analysis revealed the presence of nucleic acid fragments that
could not be derived from the plasmid carrying the putative
p-hydroxyphenylpyruvate dioxygenase fragment. Furthermore, a search of the
publicly available nucleic acid sequence databases revealed that the Arabidopsis
thaliana sequence reported for cDNA clone 91B13T7 corresponded to a truncated
cDNA (Figure 1). Based on publicly available mammalian cDNA sequence
information for p-hydroxyphenylpyruvate dioxygenase, the minimum length expected
for a cDNA encoding a complete p-hydroxyphenylpyruvate dioxygenase enzyme is
1 kb (Table 1).

Table 1

Predicted cDNA Length for Sequences
Encoding p-Hydroxyphenylpyruvate Dioxygenase

Organism           Amino Acid Residues   Minimum cDNA (kb)
Human              392                   1.176
Pig                392                   1.176
Pseudomonas sp.    357                   1.071
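
The minimum cDNA lengths in Table 1 follow from three nucleotides per amino
acid residue. A minimal Python sketch of that arithmetic is given below for
illustration; the function name minimum_cdna_kb is an assumption of the sketch,
and the termination codon and untranslated regions are deliberately not counted.

```python
def minimum_cdna_kb(amino_acid_residues):
    """Minimum coding length in kb, at three nucleotides per residue."""
    return amino_acid_residues * 3 / 1000


# Residue counts as listed in Table 1.
for organism, residues in [("Human", 392), ("Pig", 392), ("Pseudomonas sp.", 357)]:
    print(f"{organism}: {minimum_cdna_kb(residues):.3f} kb")
```

A cDNA shorter than these values cannot encode a complete enzyme of the
corresponding size, which is the reasoning applied to the truncated public
sequence.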

Therefore, based on the expected length of a cDNA capable of encoding a
functional p-hydroxyphenylpyruvate dioxygenase, the Arabidopsis thaliana
sequence obtained from the public database was insufficient to encode a
full-length, active p-hydroxyphenylpyruvate dioxygenase enzyme. Accordingly, a
cDNA with the capacity to encode a full-length Arabidopsis thaliana enzyme was
cloned, as described herein. A 400 bp segment of the insert of plasmid 91B13T7 was
liberated by digestion with restriction enzymes and used to screen a cDNA library
prepared from norflurazon-treated Arabidopsis thaliana seedlings (Scolnik, P. A.,
and Bartley, G. E. (1994) Plant Physiol. 104:1469-1470). Several clones showing
positive hybridization to this probe were sequenced. The initial determination of
the sequence of the longest cDNA clone obtained from this effort is shown in
Figure 2 and in SEQ ID NO:2. During the course of subsequent work with this
clone it became necessary to confirm certain features of the sequence. A corrected
sequence of this cDNA is presented in SEQ ID NO:12.

The sequence reported in Figure 2 indicates that this cDNA has the capacity
to encode a protein of MW 48,841 which, as shown in Figure 3, has a high level
of homology to p-hydroxyphenylpyruvate dioxygenase enzymes from other
eukaryotes.

A cDNA capable of encoding a full-length p-hydroxyphenylpyruvate
dioxygenase has also been obtained from corn. This cDNA, contained in plasmid
pMPDO, was identified in a corn cDNA library using an approximately 900 base
pair portion of the Arabidopsis cDNA as a probe. The predicted amino acid
sequence that is encoded by the corn cDNA is also compared to
p-hydroxyphenylpyruvate dioxygenase enzymes from other eukaryotes in Figure 3.

A cDNA library was prepared from messenger RNA isolated from
developing seeds of Vernonia galamensis. Random sequencing of the clones
contained in the library identified a probable clone, designated vs1.pk0015.b2, for
the p-hydroxyphenylpyruvate dioxygenase from this plant. The 513 bp expressed
sequence tag (EST) is presented in SEQ ID NO:16.

Expression of the Arabidopsis thaliana cDNA Encoding p-Hydroxyphenylpyruvate
Dioxygenase in E. coli

The nucleic acid fragments of the instant invention encoding plant
p-hydroxyphenylpyruvate dioxygenase enzymes can be operably linked to suitable
regulatory sequences, thereby creating chimeric genes that can be used to direct
expression of the enzyme in transgenic organisms. These transgenic organisms
include, but are not limited to: plants (Plant Molecular Biology; Croy, R. R. D.,
Ed.; Bios Scientific Publishers; 1993); microorganisms, including Escherichia
coli (Gold, L. (1990) Methods in Enzymology 185:11), Bacillus subtilis (Henner,
D. J. (1990) Methods in Enzymology 185:199), yeast (Gellissen, G., et al. (1992)
Antonie Leeuwenhoek 62:79), and fungi, including members of the genus
Aspergillus (Devchand, M. and Gwynne, D. I. (1991) J. Biotechnol. 17:3); and
insect cells containing recombinant baculoviruses (Luckow, V. A. and Summers,
M. D. (1988) Bio/Technology 6:47).

One skilled in the art can isolate the coding sequences from the fragments of
the invention by using or creating sites for restriction endonucleases, as described
in Sambrook, J., et al. ((1989) Molecular Cloning, A Laboratory Manual, 2nd ed.;
Cold Spring Harbor Laboratory Press; hereinafter "Maniatis"). Alternatively,
polymerase chain reaction (PCR) techniques can be employed to isolate and/or
modify the fragments of the invention (Newton, C. R. and Graham, A. (1994)
PCR; Bios Scientific Publishers).

Arabidopsis p-hydroxyphenylpyruvate dioxygenase was expressed in E. coli
under control of a T7 promoter in a strain expressing T7 RNA polymerase
(Studier, F. W., et al. (1990) Methods in Enzymology 185:60). Promoters other
than T7 are commonly used in expression vectors and could be substituted for
protein expression in E. coli. Examples of alternative promoters include, but are
not limited to, trp (Yansura, D. G. and Henner, D. J. (1990) Methods in
Enzymology 185:54), PL (Remaut, E. et al. (1981) Gene 15:81), tac (Amann, E. et
al. (1983) Gene 25:167), trc (Amann, E. et al. (1988) Gene 69:301), and
promoters such as lacUV5, lpp, PR, and hybrid and tandem promoters constructed
to combine specific features to increase strength or regulation capacity (Balbas, P.
and Bolivar, F. (1990) Methods in Enzymology 185:14).
Biochemical Evidence of Enzymatic Function 

The enzyme p-hydroxyphenylpyruvate dioxygenase catalyzes the reaction of
p-hydroxyphenylpyruvate with molecular oxygen to give homogentisate and CO2.
The enzyme can be assayed by measuring oxygen utilization (Hager, S. E., et al.
(1957) J. Biol. Chem. 225:935-947), CO2 release or homogentisate production
from radioactively labeled p-hydroxyphenylpyruvate (Lindblad, B. (1971) Clin.
Chim. Acta 34:113-121), loss of the p-hydroxyphenylpyruvate (Lin, E. C. C. et al.
(1958) J. Biol. Chem. 233:668-673), or formation of homogentisate using a
colorimetric assay (Fellman, J. H. et al. (1972) Biochim. Biophys. Acta
284:90-100) or UV detection following HPLC or a similar chromatographic
separation technique. The activity of p-hydroxyphenylpyruvate dioxygenase may
also be measured in a coupled assay in which the initial product, homogentisate, is
oxidized by homogentisate dioxygenase; formation of maleylacetoacetate is
determined by measuring absorbance at 330 nm (Fernandez-Canon, J. M. and
Penalva, M. A. (1991) Anal. Biochem. 245:218-221).

An alternative to any of the kinetic assays for p-hydroxyphenylpyruvate
dioxygenase is an end-point or fixed-time assay. The procedure is based on the
conversion of the unconsumed substrate, p-hydroxyphenylpyruvate, to its enediol
tautomer by tautomerase in the presence of borate ions and measurement of the
characteristic 308 nm peak of the tautomer (Lin, E. C. C. et al. (1958) J. Biol.
Chem. 233:668-673). The procedure involves the addition of enough
p-hydroxyphenylpyruvate dioxygenase to consume approximately 80% of the organic
substrate over 1 hour in 200 µL of assay buffer, which in this case is 50 mM
Tris, pH 7.4, 0.10 mM p-hydroxyphenylpyruvic acid, 1.75 mM ascorbate and
1.25 mM EDTA. After 1 hr the reaction is quenched by the addition of 100 µL of
0.8 M borate, pH 7.3, containing 1000 ppb of a p-hydroxyphenylpyruvate
dioxygenase inhibitor and 0.25 µL of 6.1 mg/mL of tautomerase. The absorbance
at 308 nm is read after a 30 min incubation and is stable thereafter for 2 hr.
The advantage of this assay over the kinetic procedure is that the
p-hydroxyphenylpyruvate dioxygenase is not required to oxidize the substrate in
the presence of high concentrations of borate, a condition that might interfere
with the mode of action of inhibitors. Furthermore, the assay produces
essentially a stable binary indication of p-hydroxyphenylpyruvate dioxygenase
inhibition, and is well-suited for applications which require a high throughput
of samples and assays.
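
Purely as an illustrative sketch, the end-point readings described above could
be reduced to a percent inhibition and a binary result as follows. The sketch
assumes that the background-corrected absorbance at 308 nm is proportional to
the substrate remaining, and that no-enzyme and uninhibited-enzyme control
wells are run alongside each test well; the function names, the example
absorbance values, and the 50% cut-off are hypothetical and are not part of the
described procedure.

```python
def fraction_substrate_consumed(a308_sample, a308_no_enzyme, a308_blank=0.0):
    """Fraction of substrate consumed, from residual-substrate absorbance at 308 nm."""
    remaining = (a308_sample - a308_blank) / (a308_no_enzyme - a308_blank)
    return 1.0 - remaining


def percent_inhibition(a308_test, a308_uninhibited, a308_no_enzyme, a308_blank=0.0):
    """Percent loss of substrate consumption relative to the uninhibited control."""
    consumed_test = fraction_substrate_consumed(a308_test, a308_no_enzyme, a308_blank)
    consumed_control = fraction_substrate_consumed(
        a308_uninhibited, a308_no_enzyme, a308_blank
    )
    return 100.0 * (1.0 - consumed_test / consumed_control)


def is_hit(inhibition_percent, threshold=50.0):
    """Binary call for screening; the threshold is an arbitrary assumption."""
    return inhibition_percent >= threshold


# Hypothetical readings: the uninhibited enzyme consumes about 80% of the substrate.
inhibition = percent_inhibition(a308_test=0.90, a308_uninhibited=0.20, a308_no_enzyme=1.00)
print(round(inhibition, 1), is_hit(inhibition))
```

Because the reading is stable after the 30 min incubation, a single threshold
of this kind is one simple way to obtain the binary indication of inhibition
mentioned above.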
The enzyme encoded by the nucleic acid fragments and overexpressed in
E. coli can be extracted in any conventional buffer used for extracting soluble
plant enzymes. Although a large amount of an overexpressed protein is often
insoluble, the amount that is soluble can represent as much as 50% of
the total soluble protein. Soluble overexpressed protein has high
p-hydroxyphenylpyruvate dioxygenase activity and is easily extracted. Likewise,
it may be possible to resolubilize an insoluble overexpressed protein in an
active form under appropriate conditions, since addition of sarkosyl (sodium
N-lauroylsarcosinate) to the extraction buffer appeared to increase the amount
of the overexpressed protein extracted. For optimum activity, a reducing agent
such as ascorbate or reduced glutathione should be present, as well as a source
of ferrous ion.

An overexpressed enzyme can be assayed using all the techniques
described above for measuring p-hydroxyphenylpyruvate dioxygenase activity,
while only the techniques using labeled p-hydroxyphenylpyruvate can be used to
measure activity in crude plant extracts. Therefore, the availability of an
overexpressed enzyme greatly facilitates the development of high capacity screens
to identify inhibitors of the enzyme. Potential inhibitors are evaluated for their
capacity to reduce the rate of the reaction of the enzyme, resulting in reduced
oxygen uptake and CO2 release, and lower rates of formation of homogentisate
and loss of p-hydroxyphenylpyruvate. Applicants have demonstrated that at least
one of the instant nucleic acid fragments can be overexpressed in E. coli cells,
resulting in production of a protein that catalyzes the conversion of
p-hydroxyphenylpyruvate to homogentisate with the release of CO2. Furthermore,
it has been shown that this activity is inhibited by commercial herbicides known
to inhibit p-hydroxyphenylpyruvate dioxygenase. Finally, an overexpressed enzyme
can be used in a high capacity assay to identify compounds that inhibit the
enzymatic activity of p-hydroxyphenylpyruvate dioxygenase. Such compounds
may serve as herbicides.

Preparation of Plants Tolerant to Inhibitors of p-Hydroxyphenylpyruvate
Dioxygenase

This invention embodies plants which are resistant or at least tolerant to
herbicides that target the p-hydroxyphenylpyruvate dioxygenase enzyme at levels
which are normally inhibitory to the naturally occurring p-hydroxyphenylpyruvate
dioxygenase enzyme. This altered p-hydroxyphenylpyruvate dioxygenase activity
is conferred by (1) overexpression of the wild-type p-hydroxyphenylpyruvate
dioxygenase enzyme, or (2) expression of a DNA molecule encoding a
herbicide-tolerant enzyme. The said enzyme may be a modified form of a
p-hydroxyphenylpyruvate dioxygenase enzyme that occurs naturally in a eukaryote or
prokaryote, or a modified form of a p-hydroxyphenylpyruvate dioxygenase
enzyme that naturally occurs in a plant, or a herbicide tolerant enzyme that
naturally occurs in a prokaryote (Duke et al., Herbicide Resistant Crops; Lewis:
Boca Raton; 1994). An effective amount of gene expression to render the cells of
the plant tissue substantially tolerant to the herbicide depends on whether the gene
encodes an unaltered p-hydroxyphenylpyruvate dioxygenase or a mutant or
altered form of the enzyme that is less sensitive to the herbicides. Expression of an
unaltered plant p-hydroxyphenylpyruvate dioxygenase gene in an effective
amount is that amount that provides for a 2- to 10-fold increase in herbicide
tolerance. Plants encompassed by the invention include monocotyledonous and
dicotyledonous plants. Preferred are those plants which would be potential
targets for p-hydroxyphenylpyruvate dioxygenase-inhibiting herbicides,
particularly agronomically important crops such as maize and other cereal crops.

Increased levels of expression of p-hydroxyphenylpyruvate dioxygenase
activity, from two to ten or more times the natively expressed amount, would be
sufficient to overcome growth inhibition caused by the herbicide. Plants
containing such altered p-hydroxyphenylpyruvate dioxygenase enzyme activity
can be obtained by direct selection in plants. This method is known in the art.
See, e.g., U.S. Patent No. 5,162,602, U.S. Patent No. 4,761,373, and references
cited therein.

Overexpression of p-hydroxyphenylpyruvate dioxygenase also can be
accomplished by stably transforming a host plant cell with a chimeric DNA
molecule comprising a promoter capable of driving expression of an associated
coding sequence in a plant cell and operably linked to a homologous or
heterologous coding sequence encoding p-hydroxyphenylpyruvate dioxygenase.
A "homologous" p-hydroxyphenylpyruvate dioxygenase gene is isolated from an
organism taxonomically identical to the target plant cell, whereas a "heterologous"
p-hydroxyphenylpyruvate dioxygenase gene is obtained from an organism
taxonomically distinct from the target plant.

The expression of foreign genes in plants is well-established (De Blaere et
al., (1987) Meth. Enzymol. 143:277-291). Promoters utilized to drive gene
expression in transgenic plants or plant cells (i.e., those capable of driving
expression of the associated coding sequences such as p-hydroxyphenylpyruvate
dioxygenase in plant cells) include those directing the 19S and 35S transcripts in
Cauliflower mosaic virus (Odell et al., (1985) Nature 313:810-812; Hull et al.,
(1987) Virology 86:482-493), small subunit of ribulose 1,5-bisphosphate
carboxylase (Morelli et al., (1985) Nature 315:200-204; Broglie et al., (1984)
Science 224:838-843; Herrera-Estrella et al., (1984) Nature 319:115-120; Coruzzi
et al., (1984) EMBO J. 3:1671-1679; Facciotti et al., (1985) Bio/Technology 3:241)
and chlorophyll a/b binding protein (Lamppa et al., (1986) Nature 316:750-752);
and nopaline synthase promoters (Depicker et al., (1982) J. Mol. App. Genet.
7:561-573; An et al., (1990) Plant Cell 2:225-233). The chimeric DNA
construct(s) of the invention may contain multiple copies of a promoter or
multiple copies of the p-hydroxyphenylpyruvate dioxygenase coding sequences.
In addition, the construct(s) may include coding sequences for selectable markers
and coding sequences for other peptides such as signal or transit peptides. The
preparation of such constructs is within the ordinary level of skill in the art.
Resistance to inhibitors of the plant carotenoid biosynthesis pathway, which is
also targeted by p-hydroxyphenylpyruvate dioxygenase inhibitors, has been
achieved by expressing a bacterial gene encoding phytoene desaturase driven by
the CaMV promoter (Misawa et al., (1994) Plant J. 6:481-490).

Transit peptides may be fused to the p-hydroxyphenylpyruvate dioxygenase
coding sequence in the chimeric DNA constructs of the invention to direct
transport of the expressed p-hydroxyphenylpyruvate dioxygenase enzyme to the
desired site of action. Examples of transit peptides include the chloroplast transit
peptides such as those described in Von Heijne et al., (1991) Plant Mol. Biol. Rep.
9:104-126; Mazur et al., (1987) Plant Physiol. 85:1110; Vorst et al., (1988) Gene
65:59; and mitochondrial transit peptides such as those described in Boutry et al.,
(1987) Nature 328:340-342.

It is envisioned that the introduction of enhancers or enhancer-like elements
into other promoter constructs will also provide increased levels of primary
transcription to accomplish the invention. These would include viral enhancers
such as that found in the 35S promoter (Odell et al., (1988) Plant Mol. Biol.
10:263-272), enhancers from the opine genes (Fromm et al., (1989) Plant Cell
1:977-984), or enhancers from any other source that result in increased
transcription when placed into a promoter operably linked to the nucleic acid
fragment of the invention.

Introns isolated from the maize Adh-1 and Bz-1 genes (Callis et al., (1987)
Genes Dev. 1:1183-1200), and intron 1 and exon 1 of the maize Shrunken-1 (sh-1)
gene (Maas et al., (1991) Plant Mol. Biol. 16:199-207) may also be of use to
increase expression of introduced genes. Results with the first intron of the maize
alcohol dehydrogenase (Adh-1) gene indicate that when this DNA element is
placed within the transcriptional unit of a heterologous gene, mRNA levels can be
increased by 6.7-fold over normal levels. Similar levels of intron enhancement
have been observed using intron 3 of a maize actin gene (Luehrsen, K. R. and
Walbot, V., (1991) Mol. Gen. Genet. 225:81-93). Enhancement of gene
expression by Adh1 intron 6 (Oard et al., (1989) Plant Cell Rep. 8:156-160) has
also been noted. Exon 1 and intron 1 of the maize sh-1 gene have been shown to
individually increase expression of reporter genes in maize suspension cultures by
10 and 100-fold, respectively. When used in combination, these elements have
been shown to produce up to 1000-fold stimulation of reporter gene expression
(Maas et al., (1991) Plant Mol. Biol. 16:199-207).

Any 3' non-coding region capable of providing a polyadenylation signal and
other regulatory sequences that may be required for proper expression can be used
to accomplish the invention. This would include the 3' end from any storage
protein such as the 3' end of the 10kd, 15kd, 27kd and alpha zein genes, the 3' end
of the bean phaseolin gene, the 3' end of the soybean β-conglycinin gene, the 3'
end from viral genes such as the 3' end of the 35S or the 19S cauliflower mosaic
virus transcripts, the 3' end from the opine synthesis genes, the 3' ends of ribulose
1,5-bisphosphate carboxylase or chlorophyll a/b binding protein, or 3' end
sequences from any source such that the sequence employed provides the
necessary regulatory information within its nucleic acid sequence to result in the
proper expression of the promoter/coding region combination to which it is
operably linked. There are numerous examples in the art that teach the usefulness
of different 3' non-coding regions (for example, see Ingelbrecht et al., (1989)
Plant Cell 1:671-680).

Various methods of introducing a DNA sequence (i.e., of transforming) into
eukaryotic cells of higher plants are available to those skilled in the art (see EPO
publications 0 295 959 A2 and 0 138 341 A1). Such methods include
high-velocity ballistic bombardment with metal particles coated with the nucleic
acid constructs (see Klein et al., (1987) Nature (London) 327:70-73, and see U.S.
Patent No. 4,945,050), as well as those using transformation vectors based on
the Ti and Ri plasmids of Agrobacterium spp., particularly the binary type of these
vectors. Ti-derived vectors transform a wide variety of higher plants, including
monocotyledonous and dicotyledonous plants, such as soybean, cotton and
rapeseed (Facciotti et al., (1985) Bio/Technology 3:241; Byrne et al., (1987) Plant
Cell, Tissue and Organ Culture 8:3; Sukhapinda et al., (1987) Plant Mol. Biol.
8:209-216; Lorz et al., (1985) Mol. Gen. Genet. 199:178-182; Potrykus et al.,
(1985) Mol. Gen. Genet. 199:183-188).

Other transformation methods are available to those skilled in the art, such
as direct uptake of foreign DNA constructs (see EPO publication 0 295 959 A2),
and techniques of electroporation (see Fromm et al., (1986) Nature (London)
319:791-793). Once transformed, the cells can be regenerated by those skilled in
the art. Also relevant are several recently described methods of introducing
nucleic acid fragments into commercially important crops, such as rapeseed (see
De Block et al., (1989) Plant Physiol. 91:694-701), sunflower (Everett et al.,
(1987) Bio/Technology 5:1201-1204), soybean (McCabe et al., (1988)
Bio/Technology 6:923-926; Hinchee et al., (1988) Bio/Technology 6:915-922;
Chee et al., (1989) Plant Physiol. 91:1212-1218; Christou et al., (1989) Proc.
Natl. Acad. Sci. USA 86:7500-7504; EPO Publication 0 301 749 A2), and corn
(Gordon-Kamm et al., (1990) Plant Cell 2:603-618; and Fromm et al., (1990)
Bio/Technology 8:833-839).

Altered p-hydroxyphenylpyruvate dioxygenase enzyme activity may also be
achieved through the generation or identification of modified forms of the isolated
eukaryotic p-hydroxyphenylpyruvate dioxygenase coding sequence having at least
one amino acid substitution, addition or deletion which encodes an altered
p-hydroxyphenylpyruvate dioxygenase enzyme resistant to a herbicide that
inhibits the unaltered, naturally occurring form. Genes encoding such enzymes
can be obtained by numerous strategies known in the art. A first general strategy
involves direct or indirect mutagenesis procedures on microbes, e.g., E. coli and
S. cerevisiae (Miller, (1972) Experiments in Molecular Genetics, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY; Davis et al., (1980) Advanced
Bacterial Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY;
Sherman et al., (1983) Methods in Yeast Genetics, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY; and U.S. Patent No. 4,975,374) and
cyanobacteria (Bryant, The Molecular Biology of Cyanobacteria; Kluwer
Academic Publishers: Boston, 1995). A second method of obtaining mutant
herbicide-resistant alleles of the eukaryotic p-hydroxyphenylpyruvate dioxygenase
enzyme involves direct selection in plants. For example, the effect of inhibitors
on the growth of plants such as Arabidopsis, soybean, or maize may be
determined by plating seeds, sterilized by art-recognized methods, on plates of a
simple minimal salts medium containing increasing concentrations of the
inhibitor. The lowest dose at which significant growth inhibition can be
reproducibly detected is used for subsequent experiments. Mutagenesis of plant
material may be utilized to increase the frequency at which resistant alleles occur
in the selected population. Mutagenized seed material can be derived from a
variety of sources, including chemical or physical mutagenesis of seeds, or
chemical or physical mutagenesis of pollen (Neuffer, In Maize for Biological
Research. Sheridan, ed. Univ. Press, Grand Forks, ND., pp. 61-64 (1982)), which
is then used to fertilize plants and the resulting M1 mutant seeds collected.
Typically, for Arabidopsis, M2 seeds (i.e., progeny seeds of plants grown from
seeds mutagenized with chemicals, such as ethyl methane sulfonate, or with
physical agents, such as gamma rays or fast neutrons) are plated at densities of up
to 10,000 seeds/plate (10 cm diameter) on minimal salts medium containing an
appropriate concentration of inhibitor. Seedlings that continue to grow and
remain green 7-21 days after plating are transplanted to soil and grown to maturity
and seed set. Progeny of these seeds are tested for resistance to the herbicide. If
the resistance trait is dominant, plants whose seed segregate 3:1
(resistant:sensitive) are presumed to have been heterozygous for the resistance at
the M2 generation. Plants that give rise to all resistant seed are presumed to have
been homozygous for the resistance at the M2 generation. Such mutagenesis on
intact seeds and screening of their M2 progeny seed can also be carried out on
other species, for instance soybean (see, e.g., U.S. Patent No. 5,084,082). Mutant
seeds to be screened for herbicide tolerance can also be obtained as a result of
fertilization with pollen mutagenized by chemical or physical means.
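
As a hedged illustration of how the 3:1 (resistant:sensitive) segregation
described above might be checked, the following Python sketch applies a
chi-square goodness-of-fit test to progeny counts. The test, the 5% significance
level, the function names, and the example counts are assumptions introduced
here for illustration; the text above does not specify a statistical procedure.

```python
# Chi-square critical value for 1 degree of freedom at a 5% significance level.
CRITICAL_VALUE_DF1_P05 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_3_to_1(resistant, sensitive):
    """True when the counts do not differ significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_DF1_P05


# Hypothetical progeny counts from two M2 plants.
print(consistent_with_3_to_1(152, 48))  # close to 3:1, as for a heterozygous parent
print(consistent_with_3_to_1(200, 0))   # all resistant, as for a homozygous parent
```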

EXAMPLE 1 
Cloning of a cDNA for Arabidopsis thaliana
p-Hydroxyphenylpyruvate Dioxygenase

The plasmid containing the Arabidopsis thaliana 91B13T7 expressed
sequence tag (Newman et al., (1994) Plant Physiol. 106:1241-1255) was digested
with the restriction enzymes BamHI and EcoRI, and the resulting 400 bp fragment
was used to screen a lambda phage cDNA library of Arabidopsis thaliana
seedlings (Scolnik, P. A. and Bartley, G. E. (1994) Plant Physiol. 104:1469-1470)
according to the following protocol.

E. coli KW251 cells were grown overnight in Luria Broth ("LB") containing
0.2% maltose and 10 mM MgSO4. Cells were pelleted by centrifugation and
resuspended in 10 mM MgSO4 to an OD600 of 0.5. Cell aliquots (0.8 mL) were
mixed with 0.1 mL of diluted phage samples and 7 mL of top agarose (0.7%
agarose in LB containing 10 mM MgSO4) at 45°C, and plated onto 150 mm Petri
dishes containing LB agar. Phage plaques became visible in 5-7 h, at which point
the plates were placed at 4°C.

Phage plaques were transferred to nitrocellulose filters according to standard
techniques, and the filters were hybridized to 32P-radiolabeled probe prepared
according to the method of Feinberg and Vogelstein ((1983) Anal. Biochem.
132:6-13), using the hybridization conditions of Berlyn et al. ((1989) Proc. Natl.
Acad. Sci. 86:4604-4608). After exposure to X-ray film for 48 h, 12 positive
plaques were eluted, plated, and hybridized under the same conditions. A total of
9 plaques that retained positive signals in this second round of hybridization were
subjected to in vivo excision using the ExAssist/SOLR™ system according to the
manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). DNA from
the plasmids resulting from in vivo excision of positive plaques was prepared for
DNA sequencing using the Wizard Plus™ kit (Promega, Madison, WI). Eight of
the clones that were sequenced showed strong conservation with available
p-hydroxyphenylpyruvate dioxygenase sequences, whereas the remaining clone
did not correspond to a p-hydroxyphenylpyruvate dioxygenase. Alignment with
known p-hydroxyphenylpyruvate dioxygenase sequences also revealed that two of
the clones correspond to 0.3 kbp fragments from the 3' end of the transcript, and
another two to 1.2 kbp fragments from the 5' end of the transcript. One clone of
each was used to assemble a 1.5 kbp cDNA by ligating at the internal NheI
restriction site (Figure 1). The initial determination of the DNA sequence (SEQ
ID NO:2) of the resulting cDNA clone is shown in Figure 2. Subsequent work
with this DNA fragment required confirmation of some of the features of its
sequence. Approximately ten nucleotide residues were found to have been listed
in error. Thus a corrected sequence for this DNA fragment is listed in SEQ ID
NO:14 and the deduced amino acid sequence is set forth in SEQ ID NO:15. The
revised sequences form the basis for analyses and comparisons reported herein.



EXAMPLE 2
Overexpression of the Arabidopsis cDNA in E. coli

The deduced amino acid sequence for Arabidopsis p-hydroxyphenylpyruvate
dioxygenase was aligned with the amino acid sequences of
p-hydroxyphenylpyruvate dioxygenase from mouse, pig, and Streptomyces
avermitilis using the Pileup program of GCG (Program Manual for the Wisconsin
Package, Version 8, September 1994, Genetics Computer Group, 575 Science
Drive, Madison, WI, USA 53711). This analysis suggested an additional
29 amino acid extension at the amino terminus of the Arabidopsis sequence
(positions 1-29, Figure 3 and SEQ ID NO:3). This amino-terminal extension was
assumed to be a chloroplast transit peptide which would be absent from the
mature enzyme. Therefore, removal of the chloroplast transit peptide coding
sequence coincided with transfer of the p-hydroxyphenylpyruvate dioxygenase
coding sequence from the cloning vector into the expression vector.

The Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA was moved
from the pBluescript SK- cloning vector (Stratagene, La Jolla, CA) to the
pET24c(+) expression vector (Novagen, Madison, WI) through the intermediate
cloning vector pT7BlueR (Novagen). The plasmid pGBPPD2 consists of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA and the pBluescript
SK- cloning vector (Stratagene). The plasmid pE24CP1 consists of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase cDNA, without the putative
chloroplast transit peptide DNA sequence, and the pET24c(+) expression vector
(Novagen).

The plasmids pGBPPD2 and pT7BlueR (5 µg each) were individually
digested with 20 units of Xba I (New England Biolabs, NEB, Beverly, MA) and
20 units of Hind III (Gibco BRL, Gaithersburg, MD) in NEB restriction enzyme
buffer 2 supplemented with 100 µg/mL bovine serum albumin at 37 °C for 1.75 h.
Digesting pGBPPD2 with the restriction enzymes Xba I and Hind III releases the
5' and 3' ends, respectively, of the p-hydroxyphenylpyruvate dioxygenase cDNA
from the pBluescript SK- polylinker. Products of the digestion were
electrophoretically separated in a 1 percent agarose gel using TRIS/acetate/EDTA
(TAE) buffer and visualized with ethidium bromide staining (Maniatis). Digestion of
pGBPPD2 with the two restriction endonucleases resulted in a 2922 bp vector
band and a 1499 bp p-hydroxyphenylpyruvate dioxygenase cDNA band. Only a
2863 bp band was apparent after digesting pT7BlueR with the two enzymes,
although a 24 bp fragment would also result. The 1499 bp
p-hydroxyphenylpyruvate dioxygenase band and the 2863 bp pT7BlueR band were
cut out of the gel and the associated DNA purified from the agarose using a
QIAquick Gel Extraction Kit (Qiagen, Chatsworth, CA) according to the
manufacturer's instructions. The purified DNA samples were precipitated by the
addition of sodium acetate (pH 5.2) to 0.3 M, 10 ng tRNA (added as carrier), and
two volumes of -20 °C ethanol, followed by incubation at -20 °C overnight.
Nucleic acid pellets were collected by centrifugation, washed with 70% ethanol
and air dried. Both pellets were solubilized in 10 µL of TRIS/EDTA (TE) buffer,
pH 8 (Maniatis), and then 1 µL of each sample was loaded onto a 1% agarose,
TAE gel in separate wells next to a well containing 4 µL of Mass Ladder
(Gibco BRL). All samples were adjusted to 10 µL with water before loading. DNA
was quantified by comparing band intensities of each sample with Mass Ladder
band intensities following ethidium bromide staining and UV illumination.

Approximately 300 ng of p-hydroxyphenylpyruvate dioxygenase insert was
mixed with 300 ng of double digested pT7BlueR vector in a total volume of 7 µL
and then heated to 45 °C for 5 min followed by cooling on ice. T4 DNA ligase
buffer (Gibco BRL) and 1 unit of T4 DNA ligase (Gibco BRL) were added to the
cooled DNA for a total volume of 10 µL. The ligation mix was incubated at room
temperature for 4 h and then transformed into MAX Efficiency DH5α Competent
Cells (Gibco BRL) of E. coli according to standard procedures (Maniatis).
Transformed bacteria were spread onto LB agar plates supplemented with
100 µg/mL carbenicillin and incubated overnight at 37 °C. Seventeen bacterial
colonies were selected for subsequent analysis. A portion of each colony was
inoculated into a separate 17 x 100 mm polypropylene culture tube (Falcon,
Lincoln Park, NJ) containing 2 mL of liquid LB media and 200 µg/mL
carbenicillin. Liquid bacterial cultures were incubated overnight at 37 °C with
shaking (250 rpm). Plasmid DNA was then isolated using a QIAprep Spin
Plasmid Miniprep Kit (Qiagen) according to the manufacturer's instructions. A
portion (5 µL out of 50 µL total) of each plasmid preparation was digested with
10 units each of Hind III and EcoR V (Gibco BRL) in a total volume of 15 µL
with React 2 buffer (Gibco BRL) for one h. (Note: The EcoRV site in the
pBluescript polylinker was destroyed during the preparation of pGBPPD2 so only
the EcoRV site in the pT7BlueR polylinker would be accessible to the restriction
nuclease.) Samples were separated electrophoretically in 1% agarose and
tris/borate/EDTA (TBE) buffer (Maniatis). Bands were visualized with ethidium
bromide staining; the 7 out of 17 samples that showed 2 bands (2837 and
1525 bp) contained the p-hydroxyphenylpyruvate dioxygenase insert and were
designated pT7BlueR+PD01 (see Figure 4).

In order to remove the putative chloroplast transit sequence, the remaining
45 µL of each prep of pT7BlueR+PD01 were combined into a single sample and
the DNA content determined spectrophotometrically at A260 (Maniatis). A
portion (5 µg) of pT7BlueR+PD01 was digested with 16 units of Eco47 III (MBI
Fermentas) in a total volume of 100 µL containing buffer O (MBI Fermentas) at
37 °C for 2 h. The digested plasmid DNA was then precipitated with sodium
acetate and ethanol as above and the resulting dried nucleic acid pellet was
dissolved in 60 µL of React 2 (Gibco BRL) containing 20 units of Nde I (Gibco
BRL) and incubated 2 h at 37 °C. The double digested sample was then loaded
onto a 1% agarose gel in TAE and the large 4166 bp Nde I-Eco47 III fragment
separated from the 196 bp fragment electrophoretically. The large fragment was
cut out of the gel, purified from agarose and precipitated as above.

An oligonucleotide mix was prepared consisting of 100 pmoles each of
oligos CAM32 and CAM33 (SEQ ID NOS:4 and 5, respectively) in a combined
volume of 9.9 µL. The two oligos complement each other to form a 3' blunt end
corresponding to the 5' half of an Eco47 III restriction site and also form a 5'
staggered end which corresponds to the 3' half of an Nde I restriction site.

CAM32: (SEQ ID NO:4)
5'-TATGTCCAAGTTCGTAAGAAAGAATCCAAAGTCTGATAAATTCAAGGTTAAGC-3'

CAM33: (SEQ ID NO:5)
5'-GCTTAACCTTGAATTTATCAGACTTTGGATTCTTTCTTACGAACTTGGACA-3'
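
The design of the two oligos can be checked computationally. The following is a minimal sketch, not part of the original procedure: it confirms that CAM33 is the reverse complement of CAM32 apart from a two-base 5' overhang, and translates CAM32 from its ATG to show the residues it encodes. The helper names (`reverse_complement`, `translate`) are illustrative, and the standard genetic code is assumed.

```python
from itertools import product

# Oligo sequences as listed in SEQ ID NOS:4 and 5.
CAM32 = "TATGTCCAAGTTCGTAAGAAAGAATCCAAAGTCTGATAAATTCAAGGTTAAGC"
CAM33 = "GCTTAACCTTGAATTTATCAGACTTTGGATTCTTTCTTACGAACTTGGACA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons from the start of seq; ignore a partial last codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# CAM32 is two bases longer than CAM33; the extra bases form the 5' overhang.
overhang = CAM32[: len(CAM32) - len(CAM33)]
duplex_ok = CAM32[len(overhang):] == reverse_complement(CAM33)

# Translate from the ATG that supplies the new initiation codon.
peptide = translate(CAM32[CAM32.find("ATG"):])

print(overhang, duplex_ok, peptide)
```

The two-base overhang matches the staggered end left by Nde I, the duplex ends in AGC (the 5' half of the Eco47 III site), and residues 2-12 of the translated peptide correspond to the N-terminal sequence S-K-F-V-R-K-N-P-K-S-D reported for the expressed protein later in this example.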

The oligo mix was heated to 90 °C for 1.5 min and then allowed to cool to
room temperature over 20 min. The dried nucleic acid pellet resulting from
purification of the 4166 bp Nde I-Eco47 III fragment was solubilized in 7 µL of
the cooled oligo mix and subsequently heated to 45 °C for 5 min followed by
cooling on ice. Ligation of the oligos with the Nde I-Eco47 III fragment followed
by transformation into DH5α was performed as above. Transformed bacterial
cells were spread onto LB/carbenicillin plates and incubated at 37 °C overnight.
Seventeen colonies were selected and processed to isolate plasmid DNA as above.
A portion (5 out of 50 µL) of each plasmid was double digested with 10 units each
of Nde I and Hind III and the fragments separated electrophoretically on a 1%
agarose gel in TBE. A two band pattern corresponding to insert (1373 or 1518 bp)
and vector (2844 bp) was detected. An additional double digest with 10 units
each of Xba I and Hind III was performed on another 5 µL aliquot of plasmids.
None of the plasmids that showed the smaller insert size when digested with
Nde I and Hind III contained an Xba I site. The Xba I site would be eliminated if
the two oligos replaced the 196 bp fragment originally present in pT7BlueR+PD01.
The 7 plasmid samples with the modified p-hydroxyphenylpyruvate dioxygenase
insert were combined and designated pT7BlueR+PD02.
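
The fragment sizes quoted for these diagnostic digests are mutually consistent, which can be seen with a little bookkeeping. The sketch below is illustrative only: it uses the band sizes reported in the text and assumes, as a simplification, that the double-stranded portion of the annealed oligos is the length of CAM33 (51 bp).

```python
# Band sizes (bp) reported in the text for pT7BlueR+PD01 and its derivative.
HIND3_ECORV_BANDS = [2837, 1525]   # Hind III / EcoR V digest of pT7BlueR+PD01
NDE1_ECO47_BANDS = [4166, 196]     # Nde I / Eco47 III digest of pT7BlueR+PD01
VECTOR_BAND = 2844                 # vector band in the Nde I / Hind III digest
INSERT_UNMODIFIED = 1518           # insert band, transit sequence still present
INSERT_MODIFIED = 1373             # insert band after oligo replacement

REMOVED_FRAGMENT = 196             # Nde I-Eco47 III fragment that was cut out
OLIGO_DUPLEX = 51                  # assumption: length of CAM33


def plasmid_size(bands):
    """Total plasmid size implied by a complete digest of a circular plasmid."""
    return sum(bands)


original = plasmid_size([VECTOR_BAND, INSERT_UNMODIFIED])
modified = plasmid_size([VECTOR_BAND, INSERT_MODIFIED])

print(original, modified, original - modified)
```

Under these assumptions all three digests of the unmodified plasmid imply the same total size, and the shift in the insert band equals the length of the removed fragment minus the length of the oligo duplex.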

The pT7BlueR+PD02 plasmid DNA was quantified spectrophotometrically
(above) and then 5 µg was digested with 20 units each of Hind III and Nde I in
62 µL of React 2 for 2 h at 37 °C. The digested sample was subsequently loaded
onto a 1% agarose gel in TAE and separated electrophoretically. The 1373 bp
fragment was isolated and precipitated as above. The plasmid pET24c(+) (5 µg)
was double digested with 20 units each of Nde I and Hind III in React 2 at
37 °C for 2 h and the 5245 bp fragment then gel purified on a 1% agarose gel in
TAE and subsequently separated from agarose and precipitated as above. The
dried pET24c(+) pellet was solubilized in 10 µL TE and then 8 µL was adjusted to
a 20 µL total volume with water, dephosphorylation buffer (Gibco BRL) and
1 unit of calf intestinal alkaline phosphatase (Gibco BRL). The sample was
incubated at 37 °C for 30 min and then gel purified, separated from agarose, and
precipitated as above. The dried, dephosphorylated pET24c(+) vector pellet and
modified p-hydroxyphenylpyruvate dioxygenase insert pellet were each solubilized
in 10 µL TE and then 1 µL of each was run on a 1% agarose TBE gel with 4 µL of
Mass Ladder to quantify DNA as above. One hundred nanograms of modified
p-hydroxyphenylpyruvate dioxygenase insert was mixed with 120 ng of
dephosphorylated pET24c(+) vector in a total volume of 7 µL. The mix was
heated to 45 °C for 5 min and then cooled on ice. The mix was then supplemented
with T4 DNA ligase buffer and 1 unit of T4 DNA ligase in a total volume of
10 µL and allowed to incubate at room temperature for 4 h. The ligation
mix was subsequently transformed into DH5α, spread on LB agar supplemented
with 30 µg/mL kanamycin, and incubated overnight at 37 °C. Plasmid
preparations were performed on 11 colonies as above. Plasmids were double
digested with Nde I and Hind III and fragments separated electrophoretically. All
plasmids had the expected 1373 bp and 5245 bp fragments. One bacterial colony
was selected and used to inoculate 100 mL of liquid LB supplemented with
30 µg/mL kanamycin which was subsequently incubated at 37 °C overnight with
shaking. Plasmid DNA was isolated from the resulting bacterial culture using a
Qiagen Plasmid Midi Kit according to the manufacturer's instructions. A portion
of the plasmid DNA (pE24CP1) was sequenced with the Sequenase Version 2.0
DNA Sequencing Kit (United States Biochemical, Cleveland, OH) using a
biotinylated sequencing primer to the T7 promoter (United States Biochemical)
according to the manufacturer's instructions for non-radioactive manual
sequencing. DNA was transferred from the sequencing gel to Hybond-N+ nylon
transfer membrane (Amersham, Arlington Heights, IL) by capillary action.
Transfer and all subsequent steps in chemiluminescent detection of DNA
fragments were performed with a SEQ-Light Chemiluminescent Sequencing
System kit (Tropix, Bedford, MA) according to the manufacturer's instructions.
DNA sequencing verified that the plasmid contained the expected 5' sequence for
the modified p-hydroxyphenylpyruvate dioxygenase insert where nucleotides 1-95
(Figure 2) were replaced with an ATG translational start site. This is equivalent
to amino acids 2-29 (Figure 3) being eliminated from the N-terminus of the
Arabidopsis p-hydroxyphenylpyruvate dioxygenase amino acid sequence.

The plasmid pE24CP1 was transformed into competent cells of BL21(DE3)
E. coli (Novagen), as above. Transformed cells were spread on LB/kanamycin
plates and incubated overnight at 37 °C. Seven colonies were selected for plasmid
preparations as above and plasmid DNA was double digested with Nde I and
Hind III to verify that all plasmids had the expected electrophoretic banding
pattern. One colony was selected and streaked for isolation on LB/kanamycin
plates. A well isolated colony was used to inoculate liquid LB supplemented with
30 µg/mL kanamycin and the culture was incubated at 37 °C with shaking
(250 rpm) until it reached an A600 of 0.6 absorbance units. An 8% glycerol
freezer stock was prepared according to the Novagen protocol and stored at
-80 °C. All subsequent expression studies were done with freshly grown bacterial
cells that were isolated from LB/kanamycin plates streaked from the glycerol
freezer stock.

BL21(DE3) E. coli cells containing either pE24CP1 or pET24c(+) (negative
control) were streaked out onto LB/kanamycin plates from a glycerol freezer stock
(above) and incubated overnight at 37 °C. One isolated colony was selected for
inoculation of 2 mL of LB containing 30 µg/mL kanamycin in a 17 x 100 mm
Falcon tube, and the culture was incubated at 37 °C with shaking (250 rpm)
overnight. The overnight cultures were then used to inoculate 100 mL of fresh LB
containing 30 µg/mL kanamycin. The new cultures were incubated at 37 °C with
shaking until the A600 reached between 0.4 and 0.6 absorbance units. One half of
the pE24CP1 and pET24c(+) cultures were placed in new culture flasks and IPTG
(isopropylthio-β-D-galactoside; Gibco BRL) was added to the new flasks to give a
final concentration of 1 mM. The flasks were incubated an additional 3 h at 37 °C
with shaking, and then the cells were harvested.

The harvested cells were centrifuged and the resulting cell pellet extracted
by sonication (3 x 10 sec bursts) in 2 mL extraction buffer (50 mM (20 mM in the
first experiment; Table 2) potassium phosphate buffer, pH 7.2, containing 0.14 M
KCl, 0.32 mM reduced glutathione, 1% polyvinylpolypyrrolidone, and 0.1%
Triton X-100 (0.01% lysozyme was included in the first experiment only)). The
lysate represents the crude extracted enzyme after centrifugation at 17000 g for
10 min. In the first experiment (Table 2) a 20 to 60% ammonium sulfate
precipitated enzyme fraction was also assayed. Solid ammonium sulfate was
slowly added with stirring to 2 mL of the lysate to bring the concentration to 20%
(w/v). After incubation on ice for approximately 15 min, the solution was
centrifuged at 17000 g for 10 min. The supernatant liquid was harvested and solid
ammonium sulfate was added to increase the concentration to 60% (w/v). After
centrifugation, the resulting pellet was resuspended in 1 mL of the extraction
buffer.

A portion of the insoluble protein resulting from expression of Arabidopsis
p-hydroxyphenylpyruvate dioxygenase in bacteria was utilized for N-terminal
sequence analysis. The protein (approximately 180 µg) was suspended in 60 µL
of extraction buffer and then diluted with 5 volumes of sample buffer (62.5 mM
Tris, pH 6.8, 6 M urea, 160 mM dithiothreitol, 0.01% bromophenol blue)
followed by intermittent vortexing for one hour at room temperature. A 1.5 mm
thick, 12% polyacrylamide resolving gel was prepared for a Mini-PROTEAN II dual
slab cell (Bio-Rad, Hercules, CA) using the manufacturer's instructions. The
polyacrylamide was allowed to polymerize for 3 h and then a stacking gel was
prepared using a preparative comb. The running buffer was prepared according to
the manufacturer's instructions with the addition of 0.1 mM sodium thioglycolate.
The solubilized protein sample was electrophoretically separated using the
manufacturer's instructions. When the bromophenol blue dye front reached the
bottom of the gel, the gel was removed and equilibrated for 5 min in blotting
buffer (10 mM CAPS, pH 11, 10% methanol, balance water). The gel was then
placed in a Mini Trans-Blot Electrophoretic Transfer Cell (Bio-Rad), according to
the manufacturer's instructions, with a ProBlott PVDF membrane (Applied
Biosystems, Foster City, CA) treated according to the manufacturer's instructions.
Electroblotting was done in the presence of blotting buffer at 50 volts for 45 min
in an ice bath. The membrane was then rinsed in water and stained with
Coomassie Blue as described in the ProBlott protocol. The major protein band
was excised from the membrane and subjected to N-terminal amino acid
sequencing on a Beckman (Fullerton, CA) LF3000 protein sequencer. The first
11 cycles identified S-K-F-V-R-K-N-P-K-S-D (see SEQ ID NO:3, amino acids
30-40), respectively. This is the expected N-terminus of the modified Arabidopsis
p-hydroxyphenylpyruvate dioxygenase minus the initial methionine (amino acids
30-40, Figure 3).

EXAMPLE 3

p-Hydroxyphenylpyruvate Dioxygenase Enzymatic Activity
of the Plant Protein Expressed in E. coli
Cell cultures with different plasmid constructs were extracted as described
above and assayed by measuring the formation of 14CO2 from
[1-14C]-p-hydroxyphenylpyruvate or 14CO2 and 14C-homogentisate from
[U-14C]-p-hydroxyphenylpyruvate (Lindblad, B., (1971) Clin. Chim. Acta
34:113-121; and Lindstedt, S. and Odelhog, B., (1987) Methods in Enzymology
142:143-148). The labeled substrate was prepared from [1-14C]-L-tyrosine
(55 mCi/mmol; American Radiolabeled Chemicals, Inc., St. Louis, MO) or
[U-14C]-L-tyrosine (498 mCi/mmol; DuPont NEN, Boston, MA). A 50-100 µL
aliquot (5-10 µCi) of the labeled tyrosine stock solution was transferred to a
4 mL glass vial and blown to dryness in a stream of nitrogen at 45 °C. To the vial
was added 175 µL of 0.1 M phosphate buffer, pH 6.5, 5 µL catalase (28,700 units
of C-100, Sigma Chemical Co., St. Louis, MO), and 20 µL L-amino acid oxidase
(Sigma A-9253, 6.5 units/mL). The vial was then placed on a shaker water bath
set at 30 °C, 60 cycles/min, for 0.5 to 1 h. The reaction mix was then passed
through a small column containing 400 µL Dowex AG 50W X8 cation exchange
resin. The column was then washed with 1.5 mL of water and the eluant
containing the labeled p-hydroxyphenylpyruvate was collected. The labeled
substrate was either used immediately or stored at -80 °C and used within a week
after preparation.

The assay was performed in 14 mL culture tubes capped with serum
stoppers through which a polypropylene well containing 200 µL of 1 N KOH was
suspended. The reaction mixture contained 5,740 units of catalase, 100 µL of a
freshly prepared 1:1 (v:v) mixture of 150 mM reduced glutathione and 3 mM
dichlorophenolindophenol, 5 mM ascorbate, 0.1 mM ferrous sulfate (the ascorbate
and ferrous sulfate were not present in the buffer used in the first experiment;
Table 2), 50 µM unlabeled p-hydroxyphenylpyruvate, 1-25 µL of the enzyme
extract, and 50 mM potassium phosphate buffer in a final volume of 980 µL.
Unlabeled substrate was made fresh daily in 50 mM potassium phosphate buffer
and allowed to equilibrate for at least 2 h at room temperature to ensure that
greater than 95% was in the keto form. The tubes were incubated for 10 min at
30 °C in a shaking water bath prior to adding 20 µL (0.04 µCi) of
14C-p-hydroxyphenylpyruvate. The reaction was terminated after 60 min by
injecting 500 µL of 1 N sulfuric acid through the serum stopper. The vials were
left on the shaker for another 30 min to ensure complete capture of the released
14CO2. The serum caps were then removed and the wells cut and dropped into
8 mL scintillation vials. Six mL of Formula-989 scintillation fluid (Packard
Instruments, Meriden, CT) was added to the vials and the 14C radioactivity was
determined by scintillation counting. Table 2 summarizes the results of this
experiment.

Table 2
p-Hydroxyphenylpyruvate Dioxygenase Activity of Extracts from
E. coli Containing Different Plasmid Constructs

             Inducer        Lysate                       Ammonium Sulfate Precipitate
Plasmid      (1 mM IPTG)    dpm*/mg    nmol/min x mg     dpm*/mg      nmol/min x mg
pET24c(+)    -              12,318     0.09              0            0.00
pET24c(+)    +              35,115     0.25              3,393        0.03
pE24CP1      -              24,607     0.17              126,761      0.89
pE24CP1      +              243,801    1.71              1,371,823    9.64

* 14C : 12C = 1 : 50; sp. act. of 14C-p-hydroxyphenylpyruvate = 55 mCi/mmol
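
The relation between the dpm and nmol/min columns of Table 2 can be sketched as follows. This is an illustrative reconstruction rather than the calculation actually used: it assumes the footnote ratio means 1 part labeled to 50 parts unlabeled substrate, that the dpm values refer to the 60 min reaction described above, and that one 14CO2 is released per molecule of [1-14C]-substrate consumed. The function name and parameters are hypothetical.

```python
DPM_PER_MICROCURIE = 2.22e6  # by definition, 1 µCi = 2.22e6 disintegrations/min


def dpm_to_activity(dpm_per_mg,
                    specific_activity_mci_per_mmol=55.0,
                    unlabeled_per_labeled=50.0,
                    minutes=60.0):
    """Convert 14CO2 counts (dpm per mg protein) to nmol substrate/min per mg.

    Assumes the labeled substrate is diluted 1 : unlabeled_per_labeled with
    unlabeled substrate and that the counts accumulate over `minutes`.
    """
    # mCi/mmol is numerically equal to µCi/µmol; divide by 1000 for µCi/nmol.
    uci_per_nmol = specific_activity_mci_per_mmol / 1000.0
    labeled_fraction = 1.0 / (1.0 + unlabeled_per_labeled)
    dpm_per_nmol = uci_per_nmol * DPM_PER_MICROCURIE * labeled_fraction
    return dpm_per_mg / (dpm_per_nmol * minutes)


for dpm in (12318, 35115, 24607, 243801):
    print(dpm, round(dpm_to_activity(dpm), 2))
```

With these assumptions the computed rates fall within a few hundredths of the lysate values in Table 2; exact agreement is not expected because the precise isotope dilution and rounding used for the table are not stated.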

The results show there was little or no p-hydroxyphenylpyruvate
dioxygenase activity in any of the cell cultures that did not have the plasmid
containing the nucleic acid fragment encoding p-hydroxyphenylpyruvate
dioxygenase (pET24c(+)) and the inducer of gene expression (IPTG). The gene
and inducer together resulted in a marked increase in activity.

In the experiment with [U-14C]-p-hydroxyphenylpyruvate ("HPPA"), where
both 14CO2 and 14C-homogentisic acid were measured, the reaction was initiated
by adding 50 µL of labeled substrate (0.3 µCi) and was terminated with 100 µL of
10% phosphoric acid. The 14CO2 released was determined by scintillation
counting, while the level of homogentisic acid was determined by HPLC on a
Zorbax RX-C8 column (4.6 x 250 mm) with an in-line radioactivity detector.
Aliquots of 1.7 to 15 µL were taken from the reaction mix after centrifugation and
diluted into the column equilibration buffer prior to injection. Separation was
performed at ambient temperature with a flow rate of 1.0 mL/min and the
following gradient with solvent A and B being water and methanol, each with
phosphoric acid: 0-2 min, isocratic at 95% A and 5% B; 2-17 min, linear gradient
from 95 to 75% A and 5 to 25% B; 17-19 min, linear gradient from 75 to 5% A
and 25 to 95% B; 19-22 min, isocratic at 5% A and 95% B; 22-24 min, linear
gradient from 5 to 95% A and 95 to 5% B. In this system homogentisate eluted

at 10.8 min. The results from this experiment are shown in Table 3.

Table 3
p-Hydroxyphenylpyruvate Dioxygenase Activity of Cell Extracts
Determined by CO2 Release and Homogentisic Acid Synthesis
from [U-14C]-p-Hydroxyphenylpyruvate

             Inducer        nmol/min x mg*
Plasmid      (1 mM IPTG)    14CO2      Homogentisic acid
pET24c(+)    -              0.00       0.00
pET24c(+)    +              0.19       0.00
pE24CP1      -              4.68       4.76
pE24CP1      +              29.12      29.82

* 14C : 12C = 1 : 87.7; sp. act. of [U-14C]-p-HPPA = 498 mCi/mmol

There was a tight correlation between the results from the assays of the two
products of the reaction. The results confirmed there was no significant
p-hydroxyphenylpyruvate dioxygenase activity in either cell culture that did not
contain the nucleic acid fragment encoding p-hydroxyphenylpyruvate
dioxygenase. There was measurable enzyme activity in the absence of the
inducer, but when the inducer was added the activity increased more than
six-fold over uninduced cultures. These results and those of Table 2 clearly show that
the nucleic acid fragment isolated and overexpressed in E. coli cells encodes a
protein that catalyzes the conversion of p-hydroxyphenylpyruvate to
homogentisate with the release of CO2.
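
Returning to the HPLC separation used for the homogentisic acid measurements, the solvent program described above is a piecewise-linear gradient. The sketch below encodes the stated breakpoints and interpolates the nominal percentage of solvent B at any time in the run; the names `GRADIENT` and `percent_b` are illustrative, and instrument dwell volume is ignored.

```python
# (time in min, % solvent B) breakpoints taken from the gradient described above.
GRADIENT = [
    (0.0, 5.0),
    (2.0, 5.0),    # end of the initial isocratic hold
    (17.0, 25.0),  # end of the shallow linear gradient
    (19.0, 95.0),  # end of the steep linear gradient
    (22.0, 95.0),  # end of the high-methanol hold
    (24.0, 5.0),   # return to starting conditions
]


def percent_b(t):
    """Nominal % B (methanol) at time t (min), by linear interpolation."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


def percent_a(t):
    """Nominal % A (water) at time t; A and B always sum to 100%."""
    return 100.0 - percent_b(t)


print(round(percent_b(10.8), 1))  # nominal % B when homogentisate elutes
```

On this program the homogentisate peak at 10.8 min elutes during the shallow part of the gradient, at a nominal composition of roughly 17% methanol.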

The overexpressed protein was also assayed spectrophotometrically at
ambient temperature using the enol borate-tautomerase assay (Lin, E. C. C. et al.,
(1958) J. Biol. Chem. 233:668-673). The assay buffer contained 0.4 M borate
(adjusted to pH 7.2 with 0.2 M sodium borate), 4 mM ascorbate, 2.5 mM EDTA,
40 µM p-hydroxyphenylpyruvate, and 0.5 units of tautomerase (Sigma T-6004)
per 10 mL buffer. The reaction mix was used when the tautomerization of the
substrate was complete (when absorbance at 308 nm had stabilized). The assay
was initiated by adding 40 µL of the cell extracts to 960 µL of the assay buffer,
and the reaction was followed by measuring the decrease in absorbance at 308 nm.
Table 4 summarizes the results with extracts of the same four cell cultures
described in Table 3.



WO 97/49816 PCT/US97/11295

Table 4
Spectrophotometric Assay of p-Hydroxyphenylpyruvate
Dioxygenase Activity of Cell Extracts

             Inducer
Plasmid      (1 mM IPTG)    nmol p-HPP lost/min x mg*
pET24c(+)    -              1.58
pET24c(+)    +              2.73
pE24CP1      -              4.91
pE24CP1      +              22.32

* Loss of p-hydroxyphenylpyruvate based on a molar extinction
coefficient for the equilibrium mixture of 9850 as reported by
Lin et al. ((1958) J. Biol. Chem. 233:668-673).
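
The conversion from an absorbance change to the rates in Table 4 follows the Beer-Lambert law. The sketch below is a worked illustration under stated assumptions: the extinction coefficient of 9850 is taken to be in M^-1 cm^-1, the cuvette path length is assumed to be 1 cm, and the assay volume is the 1 mL implied by adding 40 µL of extract to 960 µL of buffer. The function name and the example inputs are hypothetical.

```python
def hpp_loss_rate(delta_a_per_min, protein_mg,
                  epsilon=9850.0, path_cm=1.0, volume_ml=1.0):
    """Return nmol p-hydroxyphenylpyruvate lost per min per mg protein.

    delta_a_per_min: magnitude of the decrease in A308 per minute.
    protein_mg: mg of extract protein in the cuvette.
    epsilon: molar extinction coefficient (assumed units: M^-1 cm^-1).
    """
    molar_per_min = delta_a_per_min / (epsilon * path_cm)       # mol/L per min
    nmol_per_min = molar_per_min * (volume_ml / 1000.0) * 1e9   # nmol per min
    return nmol_per_min / protein_mg


# Hypothetical reading: A308 falls by 0.0197 per minute with 0.5 mg protein.
print(round(hpp_loss_rate(0.0197, 0.5), 2))
```

In a 1 mL assay, a decrease of 0.00985 absorbance units per minute therefore corresponds to 1 nmol of substrate lost per minute.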



EXAMPLE 4 

Inhibition of p-Hydroxyphenylpyruvate Dioxygenase by Commercial Herbicides
The enzymatic activity of the overexpressed protein is inhibited by two
herbicides known to inhibit plant p-hydroxyphenylpyruvate dioxygenase:
Sulcotrione (2-(2-chloro-4-methanesulfonylbenzoyl)-1,3-cyclohexanedione); and
Isoxaflutole (5-cyclopropylisoxazol-4-yl 2-mesyl-4-trifluoromethylphenyl
ketone). These two compounds were tested against the overexpressed protein
using both the 14CO2 and the continuous spectrophotometric enol
borate-tautomerase assays. Both compounds were added to the assay buffers in 10 µL of
acetone or dimethyl sulfoxide. The I50 values (concentration inhibiting the
enzyme 50%) were calculated based on the percent inhibition observed over
several concentrations of the inhibitor. The results of the assays are shown in
Table 5.



Table 5
I50 Values of Inhibitors of Plant p-Hydroxyphenylpyruvate Dioxygenase

                I50 value (nM) derived from
Compound        14CO2 assay    spectrophotometric assay
sulcotrione     43             44
isoxaflutole    409            1042
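
The text states only that the I50 values were calculated from the percent inhibition observed over several inhibitor concentrations; the exact method is not given. One simple possibility, shown here purely as an assumption, is log-linear interpolation between the two concentrations that bracket 50% inhibition. The function name and the example data are hypothetical and are not measurements from this work.

```python
import math


def estimate_i50(concentrations_nm, percent_inhibition):
    """Estimate I50 (nM) by interpolating on log10(concentration).

    Finds the first pair of adjacent data points whose inhibition values
    bracket 50% and interpolates linearly in log-concentration space.
    """
    points = sorted(zip(concentrations_nm, percent_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical dose-response data (nM, % inhibition), for illustration only.
example_i50 = estimate_i50([10, 100, 1000], [25, 75, 95])
print(round(example_i50, 1))
```

A full dose-response fit would normally be preferred when enough concentrations are available; the interpolation above is only a quick estimate.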



These results clearly show that the p-hydroxyphenylpyruvate dioxygenase
activity of the overexpressed protein is inhibited by commercial herbicides that
have inhibition of this enzyme as their mode of action. Moreover, the continuous
spectrophotometric assay gave similar I50 values to those obtained with the 14CO2
assay. The spectrophotometric assay can be adapted to a high capacity screen for
inhibitors of p-hydroxyphenylpyruvate dioxygenase by adapting it to a microtiter
plate assay combined with a plate reader that would read at or near 308 nm.
Furthermore, any colorimetric or fluorescent assay for homogentisate or
p-hydroxyphenylpyruvate would also be able to be readily adapted into a high
capacity screen for inhibitors of this enzyme. The isolated overexpressed enzyme
has sufficient activity to be used directly in a spectrophotometric assay or it can be
further purified for enhanced assay sensitivity.

EXAMPLE 5 

Re-construction of the Full-length p-Hydroxyphenylpyruvate Dioxygenase Gene
for Production of Active, Stable Enzyme in Bacteria

The plasmid pT7BlueR+PD02, described in Example 2 and containing the
full-length p-hydroxyphenylpyruvate dioxygenase gene, proved to have incorrect
sequence at the EcoRI site. This was re-sequenced so that an oligonucleotide
could be designed to replace the EcoRI site with an NdeI site using conventional
loop-out mutagenesis. The oligonucleotide was designed so that this procedure
also introduced an ATG initiation codon at the 5' end of the
p-hydroxyphenylpyruvate dioxygenase gene followed by the full-length
p-hydroxyphenylpyruvate dioxygenase sequence. After mutagenesis, the clone was
amplified in E. coli and the plasmid was purified. The resulting full-length gene,
"PDO-B", was then digested with the enzymes NdeI and NheI, and the ~820 bp
fragment used to replace the NdeI-NheI segment of the truncated
p-hydroxyphenylpyruvate dioxygenase gene, "PDO-A," in pE24CP1 (Example 1).
The resulting plasmid, pE24PDO-B, can be expressed in bacteria to produce the
full-length Arabidopsis p-hydroxyphenylpyruvate dioxygenase enzyme as
determined by enzyme activity and N-terminal sequence analysis.

EXAMPLE 6 

Enhanced Stability of Full Length Construct Over the Truncated Construct
Two different constructs for Arabidopsis thaliana p-hydroxyphenylpyruvate
dioxygenase, one containing the full-length sequence, PDO-B as
described in Example 5 and produced from plasmid pE24PDO-B, and one
containing the truncated sequence lacking the putative chloroplast leader
sequence, PDO-A as produced from plasmid pE24CP1, were both purified to the
same extent using a Pharmacia phenyl Sepharose hydrophobic interaction column
followed by gel filtration chromatography on Pharmacia Sephacryl 300. The two
proteins were diluted to 1 mg/mL in 20 mM bis tris-propane buffer, pH 7.2
containing 5 mM ascorbate, 1 mM reduced glutathione and 0.1 mM ferrous
ammonium sulfate and stored in a refrigerator at 4 °C for up to 10 days. Aliquots
were removed at various times and assayed for activity using the tautomerase
coupled spectrophotometric assay. Under these conditions the half-life for the
activity of the full length enzyme was 4 days, whereas the truncated enzyme
preparation had a half-life of 9 to 10 hours. In addition, the activity of the full
length enzyme could be restored by incubation with iron and reducing agent,
reduced glutathione or ascorbate, or by dialysis against buffer containing iron and
reducing agent. In contrast, the activity of the truncated enzyme could not be
restored by incubation with or dialysis against buffer containing iron and reducing
agent. The full-length enzyme was also more stable in the spectrophotometric
assay, showing a 2 to 3 times longer useful linear region than the truncated
enzyme. Both enzyme preparations showed similar I50 values with the
herbicidally active inhibitors.
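
To put the two half-lives in perspective, the fraction of activity remaining after a given storage time can be estimated. The sketch below assumes simple first-order (exponential) loss of activity, which the text does not state, and uses 9.5 h as a midpoint for the reported 9 to 10 hour half-life of the truncated enzyme.

```python
FULL_LENGTH_HALF_LIFE_H = 4 * 24.0   # 4 days, as reported for the full-length enzyme
TRUNCATED_HALF_LIFE_H = 9.5          # assumption: midpoint of the reported 9-10 h


def fraction_remaining(hours, half_life_hours):
    """Fraction of initial activity left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)


for hours in (24, 48, 96):
    full = fraction_remaining(hours, FULL_LENGTH_HALF_LIFE_H)
    truncated = fraction_remaining(hours, TRUNCATED_HALF_LIFE_H)
    print(hours, round(full, 3), round(truncated, 3))
```

Under this assumption the full-length enzyme retains most of its activity after one day at 4 °C, whereas the truncated enzyme retains less than a fifth.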

These results clearly show that the full-length PDO-B construct has
decided advantages over the truncated enzyme due to the enhanced stability under
storage conditions, in the spectrophotometric assay and in the reversible
reconstitution of activity in the presence of iron and reducing agent. While both
enzyme constructs can be used for screening of inhibitors, the PDO-B enzyme is
preferred for this application and is far superior for mechanistic and structural
studies.

EXAMPLE 7 

20 Cloninu of the Maize |r)-Hvdroxvphenvlpvaivatc Dioxvuenase Gene 

Approximately 600,000 plaques of a Stratagene maize Uni-Zap cDNA
library (from young plants) were screened by filter hybridization under moderate
stringency using a heterologous probe. The probe was prepared by PCR and was
a 916 bp fragment of DNA having the sequence defined by the region extending
from position 263 to 1178 of SEQ ID NO:14. Twenty-four positive phage clones
were identified in the primary screen, and eleven phage clones were recovered
from a secondary screen. Seven positive clones were submitted for sequencing,
and four showed significant sequence conservation at the amino acid level when
compared with the Arabidopsis thaliana p-hydroxyphenylpyruvate dioxygenase
protein. The longest of the four contained an insert of 988 bp and showed 70%
identity and 78% similarity with the Arabidopsis protein, but was lacking
approximately 550 bp corresponding to the amino terminal end of the protein.
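
The probe coordinates quoted above can be checked with inclusive-interval arithmetic: a region running from position 263 to position 1178, counted from 1 and including both ends, spans 916 bases. The sketch below shows that calculation and the matching Python slice. The helper names are hypothetical and the test sequence is a placeholder, not data from the sequence listing.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length in bases of a region given 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the region [start, end] (1-based, inclusive) of a sequence."""
    inclusive_length(start, end)  # validates the coordinates
    if end > len(sequence):
        raise ValueError("region extends past the end of the sequence")
    # Python slices are 0-based and exclude the end index.
    return sequence[start - 1:end]


probe_length = inclusive_length(263, 1178)
print(probe_length)
```

The same convention applies to the CDS locations given in the sequence listing, where a feature written as start..end includes both end positions.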

Attempts to obtain a full-length cDNA of the maize p-hydroxyphenylpyruvate
dioxygenase gene were unsuccessful, possibly because the secondary
structure of the RNA inhibited efficient reverse transcription of this transcript.
Two additional cDNA libraries were screened and clones long enough to contain a
full-length cDNA were sequenced. All of these clones were shown to be
chimeras. Therefore a genomic library was screened to obtain the 5' one-third of
the gene. Approximately 1 million clones from a Clontech Zea mays (var. B73)
library in the phage vector EMBL3 (whole seedlings, 2 leaf stage) were screened
using a 415 bp EcoRI-BssHII fragment containing the 5' end of the truncated corn
p-hydroxyphenylpyruvate dioxygenase cDNA (clone H1011C). Eight positive
primary phage clones were plated and screened, and four secondary clones were
picked. DNA was prepared from each using the Qiagen Lambda midi-kit.
Restriction digests with SalI or EcoRI indicated that two clones were the same.
DNA samples from the remaining 3 clones (11.1.3, 13.1.1, and 21.2.1) were
digested with SalI, EcoRI, or SalI and EcoRI, prepared for Southern analysis, and
probed with the full length Arabidopsis p-hydroxyphenylpyruvate dioxygenase
gene. Two of the clones (11.1.3 and 13.1.1) showed sequence conservation, and
these homologous fragments were subcloned and sequenced. Both clones
appeared to contain the full-length gene and each contained one intron near the 3'
end of the gene. However, there were differences between the sequences of the
two clones indicating that they may be two different genes or one may be a
pseudogene. The sequence of clone 11.1.3 matched the cDNA sequence, and this
clone was used to construct a full length p-hydroxyphenylpyruvate dioxygenase
coding region.

The gene was contained on two adjacent fragments, a 3.5 kb EcoRI-SalI
fragment and a 2 kb SalI fragment. Both were subcloned into pBluescript SKII+
resulting in the plasmids pES1113 and pSal1113. pES1113 was digested with
SpeI to release approximately 2.7 kb of upstream sequence and then religated,
resulting in a plasmid with an insert of 747 base pairs (pSPE1). pSPE1 was
digested with SalI to linearize the plasmid and ligated with the 2 kb SalI fragment
from pSal1113, which had been released by digestion with SalI and gel purified.
Orientation was confirmed by digestion with SpeI and Bpu1102I and the correct
plasmid was named p1113. In order to remove the intron contained in the 3' end
of the genomic clone, the plasmid was digested with Bpu1102I and XhoI and the
3.9 kb fragment containing the vector and 5' part of the gene was gel purified.
The corresponding 882 bp Bpu1102I-XhoI fragment from pH1011c (cDNA) was
gel purified and ligated with this 3.9 kb fragment resulting in the clone pMPDO
(ATCC 209120), which contains a 1782 bp insert. There are 260 base pairs
upstream of the putative ATG and 189 base pairs downstream of the stop codon.
The full-length sequence was confirmed by sequencing across the insert. The
nucleic acid sequence and the deduced protein sequence for corn
p-hydroxyphenylpyruvate dioxygenase are presented in SEQ ID NOS:10 and 11,
respectively. The sequences for p-hydroxyphenylpyruvate dioxygenases obtained
from corn and Arabidopsis were compared using the "Gap" program of GCG
(Program Manual for the Wisconsin Package, Version 9.0-OpenVMS, December
1996, Genetics Computer Group, 575 Science Drive, Madison, WI, USA 53711).
The results of these comparisons indicated that these sequences are approximately
67% identical at the nucleotide level, and they possess 69% similarity and 62%
identity at the amino acid level. The predicted amino acid sequence of corn
p-hydroxyphenylpyruvate dioxygenase is compared with that from Arabidopsis
and other eukaryotes in Figure 3.
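
For readers unfamiliar with the percent identity and percent similarity figures reported above, the following sketch shows how such values can be computed from two sequences that have already been aligned. It is not the GCG "Gap" program: Gap scores similarity from a substitution matrix, whereas this sketch uses a small set of hypothetical conservative-substitution groups, and it assumes that columns containing a gap are excluded from the count. The toy alignment is invented for illustration.

```python
# Hypothetical residue groups treated as "similar"; chosen only for illustration.
SIMILAR_GROUPS = ("ILMV", "FWY", "KRH", "DE", "NQ", "ST", "AG")


def _is_similar(a: str, b: str) -> bool:
    """True if both residues fall in the same illustrative group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Percent identity and percent similarity over ungapped alignment columns.

    Both inputs must be the same length, with '-' marking alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("alignment has no ungapped columns")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or _is_similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Toy alignment (one-letter amino acid codes), not taken from the listing.
identity, similarity = identity_and_similarity("MKV-LE", "MRVALD")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Similarity is always at least as large as identity, because every identical column also counts as similar.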

EXAMPLE 8 

Composition of a cDNA Library; Isolation and Sequencing of cDNA Clones

A cDNA library representing mRNAs from developing seeds of Vernonia
galamenensis that had just begun production of vernolic acid was prepared. The
library was prepared in a Uni-ZAP™ XR vector according to the manufacturer's
protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the
Uni-ZAP™ XR library into a plasmid library was accomplished according to the
protocol provided by Stratagene. Upon conversion, cDNA inserts were contained
in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial
colonies containing recombinant pBluescript plasmids were amplified via
polymerase chain reaction using primers specific for vector sequences flanking
the inserted cDNA sequences. Amplified insert DNAs were sequenced in
dye-primer sequencing reactions to generate partial cDNA sequences (expressed
sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651). The
resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent
sequencer.

EXAMPLE 9 

Identification and Characterization of cDNA Clones

ESTs encoding Vernonia galamenensis enzymes were identified by
conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F. et al.,
(1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/)
searches for similarity to sequences contained in the BLAST "nr" database
(comprising all non-redundant GenBank CDS translations, sequences derived
from the 3-dimensional structure Brookhaven Protein Data Bank, the last major
release of the SWISS-PROT protein sequence database, EMBL, and DDBJ
databases). The cDNA sequences obtained in Example 8 were analyzed for
similarity to all publicly available DNA sequences contained in the "nr" database
using the BLASTN algorithm provided by the National Center for Biotechnology
Information (NCBI). The DNA sequences were translated in all reading frames
and compared for similarity to all publicly available protein sequences contained
in the "nr" database using the BLASTX algorithm (Gish, W. and States, D. J.
(1993) Nature Genetics 3:266-272) provided by the NCBI. For convenience, the
P-value (probability) of observing a match of a cDNA sequence to a sequence
contained in the searched databases merely by chance as calculated by BLAST is
reported herein as a "pLog" value, which represents the negative of the logarithm of
the reported P-value. Accordingly, the greater the pLog value, the greater the
likelihood that the cDNA sequence and the BLAST "hit" represent homologous
proteins.
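
The pLog convention described above can be illustrated with a short sketch that converts between a P-value and its pLog value. The text says only "the negative of the logarithm"; base 10 is assumed here, and the function names are illustrative rather than part of any BLAST software.

```python
import math


def plog(p_value: float) -> float:
    """Negative base-10 logarithm of a P-value (the base is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be in the interval (0, 1]")
    return -math.log10(p_value)


def p_from_plog(plog_value: float) -> float:
    """Inverse conversion: recover the P-value from a pLog value."""
    return 10.0 ** (-plog_value)


# Smaller chance probabilities give larger pLog values.
for p in (1e-2, 1e-5, 1e-8):
    print(f"P = {p:.0e}  ->  pLog = {plog(p):.2f}")
```

Under the base-10 assumption, a pLog of 8 corresponds to a chance probability of one in a hundred million, so a higher pLog indicates a less likely chance match.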

The BLASTX search using clone vsl.pk0015.b2 revealed similarity of the
protein encoded by the cDNA to a number of p-hydroxyphenylpyruvate
dioxygenases from sources other than plants. The three most similar
p-hydroxyphenylpyruvate dioxygenase proteins were a streptomycete
p-hydroxyphenylpyruvate dioxygenase (GenBank Accession No. U11864;
pLog = 8.34), a rat p-hydroxyphenylpyruvate dioxygenase (GenBank Accession
No. M18405; pLog = 7.66), and a human p-hydroxyphenylpyruvate dioxygenase
(GenBank Accession No. U29895; pLog = 7.60). SEQ ID NO:16 shows the nucleotide
sequence of a portion of the Vernonia galamenensis cDNA in clone
vsl.pk0015.b2. Sequence alignments and BLAST scores and probabilities
indicate that the instant nucleic acid fragment encodes a portion of Vernonia
galamenensis p-hydroxyphenylpyruvate dioxygenase.
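
A BLASTX-style comparison starts by translating the nucleotide query in all six reading frames, three on each strand. The sketch below shows that step only, using the standard genetic code; it is not the NCBI implementation, and the function names are illustrative. Codons containing an ambiguous base such as N are reported as X, which is a simplification adopted here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (last base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T, N)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna: str) -> dict[str, str]:
    """Map frame labels '+1'..'+3' and '-1'..'-3' to their translations."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# First three codons of the coding region shown in SEQ ID NO:2 (ATG GGC CAC).
for frame, protein in six_frame_translations("ATGGGCCAC").items():
    print(frame, protein)
```

Each translated frame would then be compared against the protein database, and the best-scoring frame indicates the likely reading frame of the cDNA.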
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: E. I. DUPONT DE NEMOURS AND COMPANY
(B) STREET: 1007 MARKET STREET
(C) CITY: WILMINGTON
(D) STATE: DELAWARE
(E) COUNTRY: U.S.A.
(F) POSTAL CODE (ZIP): 19898
(G) TELEPHONE: 302-892-8112
(H) TELEFAX: 302-773-0164
(I) TELEX: 6717325

(ii) TITLE OF INVENTION: PLANT GENE FOR p-HYDROXYPHENYLPYRUVATE DIOXYGENASE

(iii) NUMBER OF SEQUENCES: 16

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: DISKETTE, 3.50 INCH
(B) COMPUTER: IBM PC COMPATIBLE
(C) OPERATING SYSTEM: MICROSOFT WORD FOR WINDOWS
(D) SOFTWARE: MICROSOFT WORD VERSION 7.0A

(v) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/021, SG-'i
(B) FILING DATE: JUNE 27, 1996

(vii) ATTORNEY/AGENT INFORMATION:
(A) NAME: FLOYD, LINDA AXAMETHY
(B) REGISTRATION NUMBER: 33,692
(C) REFERENCE/DOCKET NUMBER: BA-9120

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : Single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

CAAGAAACGN GTCGNCGACG TGCTCAGCGA TGATCAGATC AAGGAGTGTG AGGA.HTTAGG CO 

GATTCTTNTA GACAGAGATG ATCAAGGGAC GTTNCTTC7^J\ ATCTKCACAA Ar\CCACTAGG 120 

TGACAGGCCG ACGNTATTTA TAGAGAT.AAT CCAGAGNGTA GGATGCATGA TGAAA.GATGT l?.C, 

GGrJ\GGGANG GCTTACCAGA GTGGAGNATN TNGTGGTTTT G^ICAAJaGGCA ATT 2 33 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1448 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 9..1343

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TGAAATCA ATG GGC CAC CAJ\ AAC GCC GCC CTT TCA GAG AAT CAJi AAC CAT ^0 
Met Gly His Gin Asn Ala Ala Val Ser Glu Asn Gin Asn Mrs 
15 10 . 

GAT GAC GGC GCT GCG TCG TCG CCG GGA TTC AAG CTC GTC GGA TTT TCC 93 
Asp Asp Giv Ala Ala Ser Ser Pro Gly Phe Lys Lou Vai Gly P Ser 
IS 2 G 2 5 >^ 0 

.AAG TTC CTA AGA ,^iJ^G AA.T CCA AAG TCT GAT AAfA TTC /XAG GTT AA.G CGC 14 G 

Lys Phe Val Ara Lys Asn Pro Lys Ser Asp Lys P;vj I-VG Val Lys Arg 
3 C- 4 0 4 5 

TTC CAT CAC ATC GAG TTC TGG TGC GGG GAC GCA ACC .A.AC GTC GCT CGT 154 
Phe His His He Giu ?hc Trp Cvs Glv Asp Ala Thr Asn Val Ala Arg 
50 55 60 

CGC TTC TCC TGG GGT CTG GGG ATG AGA TTC TCC GCC ArW TCC GAT CTT 2 42 

Arg Phe Ser Trp Giv Leu Gly Met Arg Phie Ser Ala Lys Ser Asp Leu 
65 ' 70 75 

TCC ACC GGA AAC ATG GTT CAC GCC TCT TAC CTA CTC ACC TCC GGT GAA 2 90 

Ser Thr Glv Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Glu 
80 ' 85 90 

CTC CGA TTC CTT TTC ACT GCT CCT TAC TCT CCG TCT CTC TCC GGC GGA 338
Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Gly Gly
95                  100                 105                 110

GAG ATT AAA CCG ACA ACC ACA GGT TCT ATC CCA AGT TTC GAT CAC GGG 386
Glu Ile Lys Pro Thr Thr Thr Gly Ser Ile Pro Ser Phe Asp His Gly
                115                 120                 125

TCT TGT CGG TCC TTC TTC TCT TCA CAT GGT CTC GGT GTT AGA CCC GTT 434
Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Pro Val
            130                 135                 140

GCG ATT GAA GTA GAA GAC GCG GAG TCA GCT TTC TCC ATC AGT GTA GCT 482
Ala Ile Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala
        145                 150                 155

AAT GGC GCT ATT CCT TCG TCG CCT CCT ATC GTC CTC AAT GAA GCA GTT 530
Asn Gly Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val
    160                 165                 170

ACG ATC GCT GAG GTT AAA CTA TAC GGC GAT GTT GTT CTC CGA TAT GTT 578
Thr Ile Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg Tyr Val
175                 180                 185                 190

AGT TAC AAA GCA GAA GAT ACC GAA AAA TCC GAA TTC TTG CCA GGG TTC 626
Ser Tyr Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro Gly Phe
                195                 200                 205
GAG CGT GTA GAG GAT GCG TCG TCG TTC CCA TTG GAT TAT GGT ATC CGG 67^; 
Glu .Arg Vai Glu Asp Aia Ser Ser Pho Fro Leu Asp Tvr Giv He Arc 
210 215 220 



CGG CTT GAC CAC GCG GTG GGA AAC GTT CCT GAG CTT GGT CCG GCT TTA 7 22 

Arg Leu Asp His Ala Vai Gly Asa Val Pro Glu Leu Giv Pro Aia Leu 
225 230 235 



ACT TAT GTA GCG GGG TTC ACT GGT TTT CAl^ CAA TTC GCA GAG TTC ACA 7 70 

Thr Tyr Val Ala Gly Phe Thr Gly Phe H^s Gin Phe Aia Glu Phe Thr 

240' 245 250 

GCA GAC GAC GTT GGA ACC GCC GAG AGC GGT TTA /vAT TCA GCG GTC CTG 810 

Ala Asp Asp Val Gly Thr Ala Giu Ser Giy Leu Asn Ser Aia Val Leu 

255 260 265 270 

GCT AGC AAT GAT GAA ATG GTT CTT CTA CCG ATT AAC GAG CCA GTG CAC 8 66 

Ala Ser Asn Asp Glu Met Vai Lou Leu Pro He Asn GLu Pro Val His 

275 2S0 285 



GGA ACA AAG AGG AAG AGT GAG ATT CAG ACG TAT TTG GAA CAT AAC GA/A 914 

Gly Thr Lys Arg Lys Ser Gin lie Gin Thr Tyr Leu Giu His Asn Giu 

290 295 300 

GGC GCA GGG C7A CAA CAT CTG GCT CTG ATG AGT GAA GAC ATA TTC AGG 962 

Gly Ala Gly Leu Gin His Leu Aia Leu Met Ser Glu Asp He Phe Arg 

305 310 315 

ACC CTG AGA GAG ATG AGG AJ\G AGG AGC AGT ATT GGA GGA TTC GAC TTC 1010 

Thr Leu Arg Glu Met Arg Lys Arg Ser Ser He Gly Giy Phe Asp Phe 

320 325 330 

ATG CCT TCT CCT CCG CCT ACT TAG TAC CAG AAT CTC AAG AAA CGG GTC 105 8 

Met Pro Ser Pro Pro Pro Thr Tyr Tyr Gin Asn Leu Lys Lvs Arg Vai 
335 340 345 ^ 350 

GGC GAC GTG CTC AGC GAT GAT CAG ATC AAG GAG TGT GAG G.AA TTA GGG 110 6 

Gly Asp Val Leu Ser Asp Aso Gin He Lys Glu Cys Giu Giu Leu Gly 

355 ' 360 365 






WO 97/49816 PCT/US97/11295

ATT CTT GTA GAC AGA GAT GAT CAA GGG AGO TTG CTT CAA ATC TTC ACA 115 A 
lie Leu Vai Asp Arg Asp Asd Gin Giv Thr Lou Leu Gin lie Ph*r Thr 
370 * 375 330 

AAA CCA GTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA ATC CAG AGA 1202 
Lys b>ro Leu Gly Asp Arg Pro Thr lie Phe I.le Giu lie He Gin Arq 
385 390 395 

GTA GGA TGC ATG ATG AAA GAT GAG GAA GGG AAG GCT TAC CAG ACT 3GA 12 50 
Val Gly Cys Met Met Lys Asp Giu Giu Gly Lys Ala Tyr Gin Ser Giy 
400 405 410 

GGA TGT GGT GGT TTT GCC AAA GGC A/\T TTC TCT GAG CTC TTC pJ^.G TCC 3 2 98 
Gly Cys Giv Giy Phe Ala Lys Giy Asn Phe Ser Giu Leu Phe Ly.'; Ser 
415 ' 420 425 -^30 

ATT GPuA GAA TAC GAA AAG ACT CTT GAA GCC AAA CAG TTA GTG GGA 134 3 

He Giu Giu Tyr Giu Lys Thr Leu Giu Ala Lys Gin Leu Vai Gly 
4 35 4 4 0 4 4 !■ 

TGAACAAGAA GAAGAACCA?. CTAAAGGATT GTGTAATTAA TGTAAPJ^.CTG TTTT.-TCTTA 14 C3 

TCAPJ^J\CPJ^T GTATACAA.CA TCTCATTT/vP. A^ACGAGATC .a^.TCC 14 4 8 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 445 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Gly His Gin Asn Ala Ala Val .Ser Giu Asn Gin Asn H:; Asp Asp 
1 5 10 15 

Giy Aia Aia Ser Ser Pre Giy Phe Lys L-u Vai Giy The S^r Lys Phe 
2 0 '-■ 5 - - 

Vai Arg Lvs Asn Pro Lys Ser Aso Lys Plie Lys Vai Lys .'r: z t^h^ His 
3 5 4 0 4^ 

His iie Giu Phe Trp Cys Giy Asp Ala Thr Asn Vr:il A^a Arc Arc Phe 



50 



60 



Ser Tro Giy Leu Giy Met Arg Phe Ser Ala Lys Ser Asp Leu Ser Thr 

65 ' 70 75 30 

Giv Asn Met Val His Aia Ser Tvr Leu Leu Thr Ser Gly Giu Leu Arc: 

8 5 90 95 

Phe Leu Phe Thr Aia Pro Tvr Ser Pro Ser Leu Ser GJ y Gly Giu lie 

100 105 i:o 

Lys Pro Thr Thr Thr Gly Ser lie Pro Ser Phe Asp His Giy Ser Cys 
115 120 125 

Arq Ser Phe Phe Ser Ser His Gly Leu Gly VjI Arq Pro Val Aia lie 
130 135 1-10 

Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala Asn Gly
145                 150                 155                 160

Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val Thr Ile
                165                 170                 175

Ala Glu Val Lys Leu Tyr Gly Asp Veil Val Leu Aro Tvr Vai Ser Tyr 

180 13 5 ' 190 

Lys Ala Giu Asp Thr Giu Lys Ser Glu Phe Leu Pro Glv Phe Glu Arq 

195 200 205 

Val Glu Asp Ala Ser Ser Phe Pro Leu Aso Tyr Gly lie Arq Arq Leu 

210 215 ' 220 

Asp His Ala Val Gly Asn Val Fro Glu Leu Gly Pro Ala Leu Thr Tyr 



Val Ala Gly Phe Thr Gly Phe His Glr: Pho Ala Glu V\\e Thr Ala Asp 
245 250 255 



Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala val Leu Ala Ser 
260 265 270 

Asn Asp Glu Met Val Leu Leu Pro lie Asn Glu Pro Val His Glv Thr 
275 280 285 

Lys Arq Lys Ser Gin lie Gin Th.r Tvr Leu Glu Mrs Asn Glu Glv Ala 
290 2 95 * 300 

Gly Leu Gin His Leu Ala Leu Met r.-.:- O.u Asp lie Piie Arq Thr Leu 
305 310 - 315 320 

Arg Glu Met Arq Lys Arq Ser Ser ile Glv Giv Prse Asp Phe Met Pre 
325 330 ^ 335 

Ser Pro Pro Pro Thr Tyr Tyr Gin Asn Lou Lys Lys Aro Val Gly Asp 
340 3 350 

Va 1 Leu Ser Asp Asp Gin lie Lys Giu Cys Giu Giu Leu Gly lie Leu 
355 360 365 

Val Asp Arq Asp Asp Gin Gly Thr Leu Leu GJ n lie Phe Thr Lys Prc 
3*^0 37 S 330 

Leu Gly Aso Arc Pro Thr I riie i ^ o Giu lie lie Gin Ard Val GJ v 
355 390 395 " ^100 

Cys Men Met Lys Asp Glu Giu GJ y Lys Ala Tyr Gin Ser Gly Gly Cys 
405 ^ilO 415 

Gly Gly Phe Ala Lys Gly Asn Phe Ser Glu Leu Phe Lvs Ser lie Glu 
4 20 4 25 '430 

Glu Tyr Giu Lys Thr Leu Glu Ala Lys Gin Levi Vai Giv 
4 35 -HO " 44 5 

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 53 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

TATGTCCAAG TTCGTAAGAA AGAATCCAAA GTCTGATAAA TTCAAGGTTA AGC 53

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GCTTAACCTT GAATTTATCA GACTTTGGAT TCTTTCTTAC GAACTTGGAC A 51

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 amino acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 

Thr Ser Tyr Ser Aso Lvs Gly Giu Lys Pro Glu Arq Gly Arq Phe Leu 
1 5 ' ' 10 15 

His Phe His Ser Thr P:ie Trp Vol Civ Asn Ala Lys Gin Ala Ala 

20 25 30 

Ser Tyr Tyr Cvs Ser Lys lie Gly Phe GIli Pro Leu Ala Tyr Lys Gly 
35 ^ 4 0 

Leu Glu Thr Gly Ser Arq Glu Val Val Ser His Vctl Val Lys Gin Asp 
50 55 (^0 

Lvs He Val Phe Phe Scr Ser Ala Leu Asn i'zo Trp Asn Lys Glu 

6 5 7 0 7 5 S 0 

Met Gly Asp His Leu Val Lys His Gly Asp Gly Val Lys Asp lie Ala 
8 5 90 

Phe Glu Val Glu Asp Cys Asp Tyr He Val CI:- Lys Ala Arc Glu Arq 
100 ICS 110 

Gly Ala He lie Vai Arg Giu Glu Val Cys Cys Ala Ala Asp Val Arq 
115 120 125 

Gly His His Thr Pro Leu Asp Arg Ala Arq Gin Val Trp Glu Gly Thr 
130 135 140 

Leu Val Glu Lys Met Thr Phe Cys Leu Asp Ser Arq Pro Gin Pro Ser 
145 150 155 160 

Gin Thr Leu Leu His Arg Leu Leu Leu Ser Lys Leu Pro Lys Cys Gly 
1 6 5 17 0 17 5 

Leu Glu Ile Ile Asp His Ile Val Gly Asn Gln Pro Asp Gln Glu Met
            180                 185                 190

Glu Ser Ala Ser Gln Trp Tyr Met Arg Asn Leu Gln Phe His Arg Phe
        195                 200                 205

Trp Ser Vai Asp Asp Thr Gin lie His Thr Giu Tvr Ser Aia Leu Arg 
210 215 220 

er Vai Vai Met Aia Asn Tyr Giu Giu Ser lie Lvs Met Pro lie Asn 

5 230 235 ^ 240 



Giu Pro Ala Pro Gly Lys Lys Lys Ser Gin lie Gin Giu Tvr Vai Asp 
245 250 ' 255 

Tyr Asn Ciy Giy Aia Giy Vai Gin His lie Ala Leu L\-5 Thr Giu Asd 
260 265 ' 270 

lie lie Thr Ala lie Arg Ser Leu Ara Giu Arg Glv Vai Giu Phe Leu 
275 280 ' 285 

Aia Vai Pro Phe Thr Tyr Tvr Lys Gin Leu Gin Giu Lvs Leu Lys Ser 
290 295 * 300 " ^ 

Aia Lys lie Arg Vai Lvs Giu Ser ile Aso Vnl Leu Giu Giu Leu Lys 
305 3io ' 315 320 

lie Leu vai Asp Tyr Asp Giu Lys Gly Tyr Leu Leu Glr; Ile Phe Thr 
325 330 335 

Lys Pro Met Gin Asp Arg Pro Thr Vai Phe Leu Giu Vai lie Gin Arg 
340 345 350 

Asn Asn His Gin Giy Phe Gly Ala Gly Asn Phe Asn Ser Leu Phe Lys 
355 360 ' 365 

Aia Phe Giu Giu Giu Gin Giu Leu Arq Glv Asn Leu Thr Aso Thr Asp 
370 375 ' 380 

Pro Asn Giy Vai Pro Phe Arq Leu 
385 390 



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 392 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Thr Ser Tyr Ser Asp Lys Gly Giu Lys Pro Giu Arg Glv Arq Phe Leu 
i 5 ^10 ' 15 

His Phe His Ser Vai Thr Phe Trp Vai Gly Asn Aia Lys Gin Ala Aia 
20 25 30 

Ser Tyr Tyr Cys Ser Lys lie Giy Phe Giu Pro Leu Aia Tyr Lys Gly 
35 , 4 0 4 5 

Leu Giu Thr Giy Ser Arg Giu Vai Vai Ser fiis Vai Vai Lys Gin Asp 
50 55 60 

Lys Ile Val Phe Val Phe Ser Ser Ala Leu Asn Pro Trp Asn Lys Glu
65                  70                  75                  80

Met Gly Asp His Leu Val Lys His Gly Asp Gly Val Lys Asp Ile Ala
                85                  90                  95

Phe Giu Val Glu Asp Cvs A^p Tvr lie Vai Gin Lys Aia Arg Giu Arq 

100 105 110 

Gly Aia lie lie Vai Ara Giu Giu Val Cys Cys Aia Aia Aso Val Arg 

115 ' 120 120 

Gly His His Thr Pro Leu Aso Arg Ala Arq Gin Vai Tro Glu Glv Thr 

130 135 ' 140 

Leu Val Glu Lys Met Thr Phe Cys Leu Asd Ser Arq Pro Gin Pre Ser 

IA5 150 ' 155 ' 160 

Gin Thr Leu Leu His Ara Leu Leu Leu Ser Lys Leu Pro Lvs Cys Gly 

165 ' 1-^0 ' 175 

Leu Glu lie lie Asp His lie Vai Giv Asn Gin Pro Aso Gin Glu Met 

180 IBL " 1 90 

Giu Ser Ala Ser Gin Trp Tyr Met Arg A;^n Leu Glr^ Phe riis Ar:^ Phe 

195 2 00 ' 20 5 

Trp Ser Vtil Asp Asp Thr Giri lie His Thi Glu Tyr Sei Aid i.e'j Arq 

210 " J 1 5 2 2 0 

Ser Val Val Met Ala Asn Tyr Glu Glu Sor I^e l^ys Met Pro lie Asn 

225 230 235 24C 

Glu Pro Ala Pro Gly Lys Lvs Lys Ser Gin lie Gin Glu Tvr Val Aso 

24 5 2bO " 2 55 

Tyr Asn Gly Gly Ala Gly Val Gin His lie Aia Leu Lys Thr Glu Asp 

260 265 270 

lie lie Thr Aia lie Arc Ser Leu Ara G!u Arq G^y Vai Glu Phe Leu 

275 280 ' ' 285 

Ala Val Pro Phe Thr Tyr Tyr Lvs Gin Leu Gin Glu Lys Leu Lvs Scr 

290 295 300 

Ala Lys lie Arg Val Lys Glu Ser lie Asp Vnl Leu Glu G.j Leu Lvs 

305 " 310 " ::15 320 

lie Leu Val Asp Tvr Asd Glu Lvs Gly Tyr Leu Leu Gin lie Phe Th.r 

325 ' 330 335 

Lvs Pro Met Gin Asp Arq Pro Thr Val Phe Leu Glu Val lie Gin Ara 

340 345 350 

Asn Asn His Gin Gly Phe Gly Aia Gly Asn Phe Asn Ser Leu Pnc Lys 

355 360 365 

Aia Phe Glu Glu Glu Gin Giu Leu Arg Glv Asn Leu Thr Asd Thr Asp 

370 375 ' 330 

Pre Asn Gly Vai Pro Phe Arg Leu 

3S^ 390 

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 392 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Thr Thr Tyr Asn Asn Lys Gly Pro Lys Pro Glu Arg Glv Arg Phe Leu 
1 5 10 ^ 15 

His Phe His Ser Val Thr Phe Val Gly Asn Ala Lys Gin Ala Ala 

20 25 30 

Ser Phe Tyr Cys Asn Lys Met Gly Phe Glu Pro Leu Ala Tyr Ara Glv 
35 AO c^5 

Leu Glu Thr Gly Ser Arq Glu Val Vol Ser His Val lie Lys Arg Giv 
SO '55 60 

Lys He Val Phe Val Leu Cys Ser Ala Leu Asn Pro Trp Asn Lys Glu 

70 75 ' 80 

Met Gly Asp His Leu Val Lys iiis Glv Asd Gly Val Lvs Asp He Ala 
8 5 ■ ' 90^ ' 95 

Phe Glu Vai Glu Asp Cys Asp Hrs lie Val Gin Lys Ala Arq Glu Arc 
■iOO i05 116 

Gly Ala Lys ile Val Arq Glu Pro Tro Val Glu Gin Asp Lys Phe Gly 
H5 120 12 5 

Lys Val Lys Pne A.l a VcjI Leu Gin Tnr Tvr Gly Aso Thr Thr Mrs Thr 
130 135 ' 140 

Leu Val Glu Lys He Asn Tyr Thr Gly Arg Phe Leu Pro Glv Phe Glu 
145 150 155 ' 160 

Ala Pro Thr Tyr Lys Asp Thr Leu Leu Pro Lys Leu Pro Arq Cys Asn 
165 170 ' ' 175 

Leu Glu He He Asp His He Val Glv Asn 3 1 p. Pro Asp Gin Glu Met 
ISO iH^: 190 

Gin Ser Ala Ser Glu Trp T\-r Leu Lvs Asr; Leu Gin Phe iiis Arq Phe 
195 200 205 

Trp Ser Val Asp Asp Thr G:Ln Val His Thr Glu Tyr Ser Ser Leu Ara 
210 215 220 

Ser He Val Val Thr Asn Tyr Glu Glu Ser He Lvs Met Pro He Asn 
225 230 235 ^ 240 

Glu Pro Ala Pro Gly Arg Lys Lys Ser Gin He Gin Glu Tyr Val Asc 
245 250 255 

Tyr Asn Gly Gly Ala Gly Val Gin His He Ala Leu Lys Thr Glu Aso 
260 265 270 

He He Thr Ala He Arc His Leu Arg Glu Arq Glv Thr Glu Phe Leu 
275 280 " ' 2S5 

Ala Ala Pro Ser Ser Tyr Tyr Lys Leu Leu Arg Glu Asn Leu Lys Ser 
290 295 ■ 300 

Ala Lys Ile Gln Val Lys Glu Ser Met Asp Val Leu Glu Glu Leu His
305                 310                 315                 320

Ile Leu Val Asp Tyr Asp Glu Lys Gly Tyr Leu Leu Gln Ile Phe Thr
                325                 330                 335

Lys Pro Met Gin Asp Arq Pro Thr Leu Phe Leu Giu Val lie Gin Arq 
340 345 350 

His Asn His Gin Giy Phe Giy Ala Giy Asn Phe Asn Ser Leu Phe Lys 
355 360 365 

Ala Phe Glu Glu Glu Gin Ala Leu Arg Glv Asn Leu Thr Asp Leu Giu 
370 375 380 

Pro Asn Gly Val Arg Ser Gly Met 
385 390 

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Tyr Trp Aso Lys Giy Pro L\'s Pro Giu Glv Aro Phe Leu His Phe 

i ^ 5 ' lo" 15 

His Ser Vai Thr Phe Trp Val Giy Asn Ala Lys Gin Ala Ala Ser Phe 
20 ' 25 30 

Tyr Cys Asn Lys Met Gly Phe Glu Pro Leu Aia Tyr Lys Gly Leu Giu 
3 5 4 0 4 5 

Thr Giy Ser Arq Glu Vai Val Ser Kis Val iio. Lys Gin Gly Lys lie 
50 5 5 t.O 

Vai Phe Vai Leu Cys Ser Ala Leu Asn Pro Tr:.- Asn Lys Glu Met Giy 

65 70 '.-5" 

AsD His Leu Vai Lys His Giy Asp Giy Val Lys Asp lie Aid Phe Gi.: 

35 ' 90 95 

Vai Giu Asp Cys Giu His lie Val Gin Lys Ala Arq Gi.j Arg Gly Ala 
100 105 110 

Lys lie Val Ara Giu Pro Trp Vai Giu Giu Asp Lys Phe Gly Lys Val 
115 " 120 125 

Lys Phe Ala Vai Leu Gin Thr Tyr Giy Asp Thr Thr His Thr Leu Vai 
130 135 140 

Giu Lys lie Asn Tyr Thr Giy Arg Phe Leu Pro Giy Phe Giu Ala Pro 
145 150 155 160 

Thr Tyr Lys Asd Thr Leu Leu Pro Lys Leu Pro Ser Cys Asn Leu Giu 
165 170 175 

lie lie Asp His lie Vai Gly Asn Gin Pro Aso Gin Giu Met Giu Ser 
180 185 190 

Ala Ser Glu Trp Tyr Leu Lys Asn Leu Gln Phe His Arg Phe Trp Ser
        195                 200                 205

Val Asp Asp Thr Gln Val His Thr Glu Tyr Ser Ser Leu Arg Ser Ile
    210                 215                 220

Vai Vai Ala Asn Tyr Glu Giu Ser lie Lys Met Pro lie Asn Giu Pro 
225 230 235 240 

Ala Pro Gly Arg Lvs Lys Ser Gin lie Gin Giu Tyr Vai Asp Tyr Asn 
2Ab ^ 250 255 

Giy Giy Aia Giy Val Gin His He Alu Leu Arq Thr Glu Asd lie lie 
260 265 * 270 

Thr Thr He Arg His Leu Arg Giu Arg Giy Met Giu Fhe Leu Aia Val 
275 280 285 

Pro Ser Ser Tyr Tyr Ara Leu Leu Arg Glu Asn Leu Lys Thr Ser Lys 
290 " 295 300 

lie Gin Vai Lys Glu Asn Met Asp Vai Leu Glu Giu Leu Lys lie Leu 
3C5 ' 310 315 320 

Val AsD T-.r Asc Giu Lvs Giv Tyr Leu Leu Gin lie br.e Thr Lys Pro 
325 330 ?35 

:-:et Gin Aso Arc Pro Thr Leu Pho Leu Glu Val He Gir: Arq His Asn 
340 345 350 

H^s Gin Glv Phe Gly Ala Gly Asn Phe Asn Ser Leu Phe Lys Ala Phe 
355 360 3t5 

Glu Giu Giu Gin Aia Leu Arq Gly 
370 375 

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1766 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Zea mays

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 261..1595

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

ACTAGTTGTG AGAGCCTTCT GCGTTGGCAA TTGGCAGTAC .AAGACAAATC ACATCCGCAA 60 

CCGC.A.^CC.=.C AGAATCGTCC GTCCACGTGG CCCCCATCAC TTCCCTTTAT TTACC.AGTCC 120 

TCCCCCATCC CCAGGGCCAC CCACCAACAA GTGCAGTCAC CCGAGCCCCA TyACTGCAGCT 180 

CTGCAAGCTA CAGAGGCCAC CACGAGTCCA CGACGCCACG CCCTCCGAGA GAAA.GAGhAP^. 24 0 




GAGAAAACCA AAGCACGATA ATG CCC CCG ACC CCC ACA GCC GCC GCA GCC j90 

Met Pre Pro Thr Pro Thr Ala Aia Ala Aia 
1 S 10 

GGC GCC GCC GTG GCG GCG GCA TCA GCA GCG GAG CAA GCG GCG TTC CGC 3 38 
Giy Aia Aia Vai Aia Aia Aia Ser Aid Ala Giu Gin Ala Ala Phe Arg 
15 20 25 

CTC GTG GGC CAC CGC AAC TTC GTC CGC TTC AAC CCG CGC TCC GAG CGC 38 6 
Leu Vai Gly His Arg Asn Phe Val Arq Phe Asn Pro Arg Ser Asp Arg 
30 35 40 

TTC CAC ACG CTC GCG TTC CAC CAC GTG GAG CTC TGG TGC GCC GAC GCG 4 34 
Phe His Thr Leu Aia Phe His Mrs Vai Giu Leu Trp Cys Aia Aso Aia 
45 50 ' 55 

GCC TCC GCC GCG GGC CGC TTC TCC TTC GGC CTG GGC GCG CCG CTC GCC 4 82 
Aia Ser Aia Aia Giy Arg Phe Ser Phe Gly Leu Gly Aia Pro Leu Ala 
60 65 ' 7 0 

GCA CGC TCC GAC CTC TCC ACG GGC AAC TCC GCG CAC GCG TCC CTG CTG 5 30 
Aia Arg Ser Asp Leu Ser Thr Gly Asn Ser Aict fU:3 Ala Ser Leu Leu 
'5 80 8 5 90 

CTC CGC TCC GGC TCC CTC TCC . TTC CTC TTC ACG GCG CCC TAC GCG CAC 57B 
Leu .Arq Ser Giy Ser Leu Ser Phe Leu Phe T^lr Aia T ro Tyr Ala Hi:- 
95 100 105 

GGC GCC GAC GCT GCC ACC GCC GCG CTG CCC TCC TTC TCC GCC GCC CCC 62 6 
Giy Aia Asp Aia Ala Thr Ala Ala Leu Pro Ser Phe Ser Aia Ala Ala 
110 115 120 

GCG CGG CGC TTC GCA GCC GAC CAC GGC CTC GCG GTG CGC GCC GTC GCG 67 4 
Ala Arg Arg Phe Ala Ala Asp His Giv Lou Ala Val Arq Aia Vai A.la 
125 130 135 

CTC CGC GTC GCC GAC GCC GAG GAC GCC TTC CGC GCC AGC GTC GCG GCC 72 2 
Leu Arg Val Aia Asp Aia Giu Asp Ala Phe A.rq Ala Ser Val Ala Ala 
140 145 .150 

GGG GCG CGC CCG GCG TTC GGC CCC GTC GAC CTC GGC CCC GGC TTC CGC 7 70 
'M'^ Ala Arg Pro Aia Phe G 1 y P r o V r i \ h 3 o i.. e u 1 \' A 1. o G i v P h c A r a 
155 " 160 ' ' 165 ' 170 

CTC GCC GAG GTC GAG CTC TAC GGC GAC GTC GTG CTC CGG TAC GTG AGC 8 18 
Leu Aia Giu Val Giu Leu Tyr Giy Asp Val Vai Leu Aiq Tyr Val Ser 
175 180 ' 185 

TAC CCG GAC GGC GCC GCG GGC GAG CCC TTC CTG CCG GGG TTC GAG GGC 8 66 
Tyr Pro Asp Gly Aia Ala Gly Giu Pro Phe Leu Pro Gly Phe Giu Gly 
190 195 ' 200 

GTG GCC AGC CCC GGG GCG GCC . GAC TAC GGG CTG AGC AGG TTC GAC CAC 914 
Vai Aia Ser Pro Gly Ala Ala Asp Tyr Gly Leu Ser Arg Phe Asd His 
205 210 215 

ATC GTC GGC AAC GTG CCG GAG CTG GCG CCC GCC GCC GCC TAC TTC GCC 962 
lie Vai Giy Asn Val Pro Giu Leu Aia Pro Ala Ala Ala Tyr Phe Aia 
220 225 230 

GGC TTC ACG GGG TTC CAC GAG TTC GCC GAG TTC ACG ACG GAG GAC GTG 1010
Gly Phe Thr Gly Phe His Glu Phe Ala Glu Phe Thr Thr Glu Asp Val
235                 240                 245                 250

GGC ACC GCG GAG AGC GGC CTC AAC TCC ATG GTG CTC GCC AAC AAC TCG 1058
Gly Thr Ala Glu Ser Gly Leu Asn Ser Met Val Leu Ala Asn Asn Ser
                255                 260                 265

GAG AAC GTG CTG CTC CCG CTC AAC GAG CCG GTG GAG GGC ACC AAG CGC 1106 

Giu Asn Val Leu Lou Pro Leu Asn Giu Pro Vai His Gly Thr Lvs Arq 

270 275 260 

CGC AGC GAG ATA CAA ACG TTC CTG GAC CAC CAC GGC GGC CCG GGC GTG 1154 

Arg Ser Gin lie Gin Thr Phe Leu Asp His His Gly Glv Pro Gly Vai 

285 290 295 

CAG CAC ATG GCG CTG GCC AGC GAC GAC GTG CTC AGG ACG CTG AGG GAG 2 2 02 

Gin His Met Ala Leu Ala Ser Asd Asp Val Leu Arg Thr Leu Arg Giu 

300 305 310 



ATG CAG GCG CGC TCG GCC ATG GGC GGC TTC GAG TTC ATG GCG CCT CCC 1250
Met Gln Ala Arg Ser Ala Met Gly Gly Phe Glu Phe Met Ala Pro Pro
315                 320                 325                 330

AC A TCC GAC TAC TAT GAC GGC GTG AGG CGG CGC GCC GGG GAC GTG CTC 12 98 
Thr Ser Asp Tyr Tyr Aso Gly Vai Arq Arq Arq Ala Gly Asp Val Leu 
335 34 0 ' ' 34 5 

ACG GAA GCA CAG ATT AAG GAG TGC CAG GAG CTA GGG GTG CTG GTG GAC 13^6 
Thr Giu Ala Gin lie Lys Giu Cvs Gin Giu Leu Glv Val Leu Vai Aso 
350 ' 355 ' 360 

AGG GAT GAC CAG GGC GTG CTG CTC ATC TTC ACC AAG CCA GTG GGG 13 9^ 

Arg- Asp Asp Gin Gly Val Leu :_eu Gin He Phe Thr Lys Pro Val Gly 
365 370 375 

GAC AGG CCA ACG CTG TTC TTC GAA ATC ATC C.A.A AGG ATC GGG TGC ATG 1^4 2 
Asp Arq Pro Thr Leu Phe Leu Giu He lie Gin Arq Xlo Gly Cv3 Met 
380 385 390 

GAG AA.G GAT GAG AP,G GGG CA.A G?'J\ TAC CAA, AAG GGT GGC TGC GGC GGG 14 90 
Giu Lys Asp Giu Lys Gly Gin Giu Tyr Gin Lys Gly Gly Cys Gly Gly 
395 ^00 ' 405 ' ^ 410 

TTC GGC AAG GGA AAC TTC TCG CAG CTG TTC AAG TCC ATC GAG GAT TAT 15 33 
Phe Gly Lys Gly Asn Phe Ser :^Ln Leu Phe Lvs Ser lie Giu Aso Tvr 
4 15 4 20 ' 4 25 ' 

GAG A/\G TCC CTT GAA GCC AAG C.^^JK GCT GCT GCA GCA GCT GCA GCT CAG 1586 
Giu Lys Ser Leu Giu Ala Lys Gin Ala Aia Aia Aia Ala Aia Ala Gin 
430 435 440 

GGA TCC TAG GACAGTGCTT GGAGACGAGC A.-\CTGCTGTG GCACTTTGTA 163 5 
Gly Ser 

TCATGGAACA GAAATAATGA AGCGTGTTCT TTGTGACACT TGACATGCAA ATGTTTGTGT 16 95 

TCTGTAACCG TTGAATATAT GGGACGATGC TATGATGGTG T.^uATAGATGG TAGAGAGGGT 17 55 

ACAACCCTGA T 17 66 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 445 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Pro Pro Thr Pro Thr Ala Ala Ala Ala Gly Ala Ala Val Ala Ala
1 5 10 15

Ala Ser Ala Ala Glu Gln Ala Ala Phe Arg Leu Val Gly His Arg Asn
20 25 30

Phe Val Arg Phe Asn Pro Arg Ser Asp Arg Phe His Thr Leu Ala Phe
35 40 45

His His Val Glu Leu Trp Cys Ala Asp Ala Ala Ser Ala Ala Gly Arg
50 55 60

Phe Ser Phe Gly Leu Gly Ala Pro Leu Ala Ala Arg Ser Asp Leu Ser
65 70 75 80

Thr Gly Asn Ser Ala His Ala Ser Leu Leu Leu Arg Ser Gly Ser Leu
85 90 95

Ser Phe Leu Phe Thr Ala Pro Tyr Ala His Gly Ala Asp Ala Ala Thr
100 105 110

Ala Ala Leu Pro Ser Phe Ser Ala Ala Ala Ala Arg Arg Phe Ala Ala
115 120 125

Asp His Gly Leu Ala Val Arg Ala Val Ala Leu Arg Val Ala Asp Ala
130 135 140

Glu Asp Ala Phe Arg Ala Ser Val Ala Ala Gly Ala Arg Pro Ala Phe
145 150 155 160

Gly Pro Val Asp Leu Gly Arg Gly Phe Arg Leu Ala Glu Val Glu Leu
165 170 175

Tyr Gly Asp Val Val Leu Arg Tyr Val Ser Tyr Pro Asp Gly Ala Ala
180 185 190

Gly Glu Pro Phe Leu Pro Gly Phe Glu Gly Val Ala Ser Pro Gly Ala
195 200 205

Ala Asp Tyr Gly Leu Ser Arg Phe Asp His Ile Val Gly Asn Val Pro
210 215 220

Glu Leu Ala Pro Ala Ala Ala Tyr Phe Ala Gly Phe Thr Gly Phe His
225 230 235 240

Glu Phe Ala Glu Phe Thr Thr Glu Asp Val Gly Thr Ala Glu Ser Gly
245 250 255

Leu Asn Ser Met Val Leu Ala Asn Asn Ser Glu Asn Val Leu Leu Pro
260 265 270

Leu Asn Glu Pro Val His Gly Thr Lys Arg Arg Ser Gln Ile Gln Thr
275 280 285

Phe Leu Asp His His Gly Gly Pro Gly Val Gln His Met Ala Leu Ala
290 295 300

Ser Asp Asp Val Leu Arg Thr Leu Arg Glu Met Gln Ala Arg Ser Ala
305 310 315 320

Met Gly Gly Phe Glu Phe Met Ala Pro Pro Thr Ser Asp Tyr Tyr Asp
325 330 335

Gly Val Arg Arg Arg Ala Gly Asp Val Leu Thr Glu Ala Gln Ile Lys
340 345 350

Glu Cys Gln Glu Leu Gly Val Leu Val Asp Arg Asp Asp Gln Gly Val
355 360 365

Leu Leu Gln Ile Phe Thr Lys Pro Val Gly Asp Arg Pro Thr Leu Phe
370 375 380

Leu Glu Ile Ile Gln Arg Ile Gly Cys Met Glu Lys Asp Glu Lys Gly
385 390 395 400

Gln Glu Tyr Gln Lys Gly Gly Cys Gly Gly Phe Gly Lys Gly Asn Phe
405 410 415

Ser Gln Leu Phe Lys Ser Ile Glu Asp Tyr Glu Lys Ser Leu Glu Ala
420 425 430

Lys Gln Ala Ala Ala Ala Ala Ala Ala Gln Gly Ser
435 440

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1356 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1254

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..3

(D) OTHER INFORMATION: /standard_name=

"translation initiation
codon"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1252..1254

(D) OTHER INFORMATION: /standard_name=

"translation termination
codon"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATG TCC AAG TTC GTA AGA AAG AA.T CCA AAG TCT GAT AAA TTC AAG GTT 4 8 

Met Ser Lys Phe Val Arg Lys Asn Pro Lvs Ser Asp Lys Phe Lys Val 
1 5 10 15 

AAG CGC TTC CAT CAC ATC GAG TTC TGG TGC GGC GAC GCA ACC AAC GTC 96 
Lys Arg Phe His His lie Giu Phe Trp Cvs Glv Asp Ala Thr Asn Val 
20 25 30

GCT CGT CGC TTC TCC TGG GGT CTC GGG ATG AGA TTC TCC GCC AAA TCC 144

Ala Arg Arg Phe Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser

35 40 45

GAT CTT TCC ACC GGA A.AC ATG GTT CAC GCC TCT TAG CTA CTC ACC TCC 192 

Asp Leu Ser Thr Giy Asn Met Vax His Aia Ser Tyr Leu Leu Thr Ser 

5 0 5 5 60 

GGT GAC CTC CGA TTC CTT TTC ACT GCT CCT TAG TCT CCG TCT CTC TCC 24 0 

Giv Asp Leu Arg Phe Leu Phe Thr Ala Pre Tyr Ser Pro Ser Leu Ser 

65 7 0 7 5 8 0 

GCC GGA GAG ATT AAA CCG ACA ACC ACA GCT TCT ATC CCA AGT TTC GAT 28 8 

Ala Giy Glu He Lys Pro Thr Thr Thr Ala Ser He Pro Ser Phe Asp 

85 90 95 

CAC GGC TCT TGT CGT TCC TTC TTC TCT TCA CAT GGT CTC GGT GTT AGA 336 

His Giy Ser Cys Arg Ser Phe Phe Ser Ser Hir> Giy Leu Giy Val Arg 

100 105 ' 110 

GCC GTT GCG ATT GA-A GTA GAA GAC GCA GAG TCA GCT TTC TCC ATC AGT 33 4 

Aia Val Aia He Glu Val Glu Asp Ala Glu Ser Ala Phe Ser He Ser 

115 120 125 

GTA GCT AAT GGC GCT ATT CCT TGG TCG ^.:CT CCT ATC CTC CTC AAT GAA -i 32 

Val Aia Asn Giv Ala He Pro Ser Ser Pro Pro He Va L Leu Asn Gin 

130 135 MG 

GCA GTT ACG ATC GCT GAG GTT AAA CTA TAG GGC CAT GTT GTT CTC CGA ABO 

Aia Val Thr He Aia Giu Vul Lys Leu Tyr Giy Asp Vai Vai Leu Arg 

145 ISO 155 IGO 

TAT GTT AGT TAG AJVA GCA GAJ^^ GAT ACC G.AA AA^A TCC GAA TTC TTG CCA 5 28 

Tyr Val Ser Tyr Lys Aia Giu Asd Thr Glu Lys Ser Glu Phe L^u Pro 

165 170 175 

GGG TTC GAG CGT GTA GAG GAT GCG TCG TCG TTC CCA TTG GAT TAT GGT 57 6 

Giy Phe Glu Arg Val Glu A*sp Aia Ser Ser Ph.e Pro Lou Asp Tyr Giy 

180 165 190 ^ 

ATC CGG CGG CTT GAC CAC GCC GTG GGA A„^.C GTT CCT GAG CTT GGT CCG 62 4 

He Arq Arq Leu Asp His Aia Val Giy Asn Val Pro GLu Leu Gl\- Pro 

195 200 205 

GCT TTA ACT TAT GTA GCG GGG TTC ACT GGT TTT CAC CAJi. TTC GCA GAG .37 2 

Aia Leu Thr Tyr Val Ala Giy Phe Tr.r Giy Phe His Gin Phe Ala Glu 

210 215 220 

TTC ACA GCA GAC GAC GTT GGA ACC GCC GAG AGC GGT T'rA AAT TCA GCG 7 20 

Phe Thr Aia Asp Asp Val Giy Thr Ala Giu Ser Giv i-eu Asn Ser Ala 

225 230 235 ' 240 

GTC CTG GCT AGC AAT GAT GAA ATG GTT CTT CTA CCG ATT AJ\C GAG CCA 7 68 

Val Leu Aia Ser Asn Asp Giu Met Val Leu Leu Pro He Asn Giu Pro 

245 250 255 

GTG CAC GGA ACA AAG AGG AAG AGT CAG ATT GAG ACG TAT TTG GA/\ CAT 816 

Val Hrs Giy Thr Lys Arg Lys Ser Gin He Gin Thr Tyr Leu Giu His 

260 ' 265 270 

AAC GAA GGC GCA GGG CTA CAA CAT CTG GCT CTG ATG AGT GAA GAC ATA SG4 

Asn Glu Giy Ala Giy Leu Gin His Leu Aia Leu Met Ser Giu Asp He 

275 280 285

TTC AGG ACC CTG AGA GAG ATG AGG AAG AGG AGC AGT ATT GGA GGA TTC 912
Phe Arg Thr Leu Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe
290 295 300

GAC TTC ATG OCT TCT OCT CCG OCT ACT T/VC TAG GAG A/vT CTC /\AG ?JkA 9 60 

Asp Phe Met Pro Ser Pro Pre Pro Thr Tvr Tyr Gin Asn L-';eu Lvs Lyr^ 
305 310 " 315 320 

CGG GTC GGC GAC GTG CTC AGC GAT GAT GAG ATC A-AG GAG TGT GAG GAA 1006 
Arg Vai Giy Asp Val Leu Ser Asp Asp Gin lie Lys Giu Cys Glu Giu 
325 330 335 

TTA GGG ATT CTT GTA GAC AGA GAT GAT CAA GGG ACG TTG CTT CAA. ATC 105 6 
Leu Giy lie Leu Val Aso Arq Asd Asp Gin Gly Thr Leu Leu Gin lie 
3^0 " ' ^ 345 350 

TTC ACA AAA CCA CTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA ATC ilO-l 
Phe Thr Lys Pro Leu Gly Asp Ara Pro Thr lie Phe lie Giu lie lie 
355 360 365 

CAG AGA GTA GGA TGC ATG ATG AA.A GAT GAG GAA GCT- A.AG GCT TAC GAG 1152 
Gin Arg Val Giy Cys Met Met Lys Asd G1^ Glu Gly l.ys Ala Tyr Gin 
370 375 ' 380 

ACT GGA GGA TGT GGT GGT TTT GGC .A.AA GGC Pu\T TTC TCT GA:i CTC TTC 1200 
Ser Gly Giy Cys Gly Gly Phe Gly Lys Giy Asn Pfie Ser Giu Leu Phe 
3B5 ' ' 390 ^ " 395 ^100 

AAG TCC ATT GAJ\ GAA TAC GA.A AAG ACT CTT GAA GGC PJ^ CAG TTA GTG 1218 
Lys Ser lie Glu Glu Tyr Giu Lys Thr Leu Giu Ala Lys Gin Leu Val 
4 05 4 10 ' A 15 

GGA TGA ACA„AGA.AGAA G-AACCAJ\CTA /'^.AGGATTGTG TA^-.TTA-ATGT AA^^^ACTGTTT 130*1 
Gly 

TATCTTATCA .AAAC'vATGTA TACAJ^CATCT CATTTA^AAA^ CGAGATCAAT CC 1356 
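
The feature table for SEQ ID NO: 12 places the coding sequence at positions 1 to 1254, with the termination codon at 1252 to 1254. As an illustrative consistency check that is not part of the original listing, the short Python sketch below counts the codons implied by a CDS location. The function name `cds_codon_count` is hypothetical, and the coordinates are assumed to be 1-based and inclusive, as in the feature table.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3


# SEQ ID NO: 12: CDS at 1..1254, termination codon at 1252..1254.
codons = cds_codon_count(1, 1254)
residues = codons - 1  # the last codon is the stop codon, not an amino acid
print(codons, residues)  # 418 417
```

Under these assumptions the CDS spans 418 codons, that is 417 amino acid residues followed by the stop codon.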
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ser Lys Phe Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val
1 5 10 15

Lys Arg Phe His His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val
20 25 30

Ala Arg Arg Phe Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser
35 40 45

Asp Leu Ser Thr Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser
50 55 60

Gly Asp Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser
65 70 75 80

Ala Gly Glu Ile Lys Pro Thr Thr Thr Ala Ser Ile Pro Ser Phe Asp
85 90 95

His Gly Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg
100 105 110

Ala Val Ala Ile Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser
115 120 125

Val Ala Asn Gly Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu
130 135 140

Ala Val Thr Ile Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg
145 150 155 160

Tyr Val Ser Tyr Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro
165 170 175

Gly Phe Glu Arg Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly
180 185 190

Ile Arg Arg Leu Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro
195 200 205

Ala Leu Thr Tyr Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu
210 215 220

Phe Thr Ala Asp Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala
225 230 235 240

Val Leu Ala Ser Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro
245 250 255

Val His Gly Thr Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His
260 265 270

Asn Glu Gly Ala Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile
275 280 285

Phe Arg Thr Leu Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe
290 295 300

Asp Phe Met Pro Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys
305 310 315 320

Arg Val Gly Asp Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu
325 330 335

Leu Gly Ile Leu Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile
340 345 350

Phe Thr Lys Pro Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile
355 360 365

Gln Arg Val Gly Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln
370 375 380

Ser Gly Gly Cys Gly Gly Phe Gly Lys Gly Asn Phe Ser Glu Leu Phe
385 390 395 400

Lys Ser Ile Glu Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val
405 410 415

Gly *

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1448 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 9..1346

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 9..11

(D) OTHER INFORMATION: /standard_name=

"translation initiation
codon"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1344..1346

(D) OTHER INFORMATION: /standard_name=

"translation termination
codon"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TGAAATCA ATG GGC CAC CAA AAC GCC GCC GTT TCA GAG AAT CAA AAC CAT 50 
Met Gly His Gin Asn Ala Ala Val Ser Glu Asn Gin Asn His 
15 10 

GAT GAG GGC GCT CCG TCG TCG CCG GGA TTC .AAG CTC GTC GGA TTT TCC 98 
Asp Asp Glv Ala Ala Ser Ser Pro Glv Phe Lvs Leu val Gly Phe Ser 
15 2 0 ' :.^ 30 

.AAG TTC GTA AGA .^-AG AAT CCA AAG TCT GAT AA-A TTC P-AG GTT AAG CGC 146 
Lys Phe Val Arc Lys Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg 
3 5 4 0 4 5 

TTC CAT CAC ATC GAG TTC TGG TGC GGC GAG GCA ACC AAC GTC GCT CGT 194 
Phe His, His lie Glu Phe Trp Cys Giv Aso Ala Thr Asn Val AJ a Arg 
50 55 * 60 

CGC TTC TCC TGG GGT CTG GGG ATG AGA TTC TCC GCC PJ\A TCC GAT CTT 24 2 

Arg Phe Ser Trp Gly Leu Gly Met Arg Pihe Ser Ala Lvs Ser Asp Leu 
65 70 75 

TCC ACC GGA AAC ATG GTT CAC GCC TCT TAG CTA CTC ACC TCC GGT GAG 2 90 

Ser Thr Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Glv Asp 
80 85 90 

CTC CGA TTC CTT TTC ACT GCT CCT TAC TCT CCG TCT CTC TCC GCC GGA 336 
Leu Arg Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Ala Gly 
95 100 105 110 

GAG ATT AAA CCG ACA ACC ACA GCT TCT ATC CCA ACT TTC GAT CAC GGC 38 6 

Glu Tie Lys Pro Thr Thr Thr Ala Ser lie Pro Ser Phe Asp His Glv 
115 120 125

TCT TGT CGT TCC TTC TTC TCT TCA CAT GGT CTC GGT GTT AGA GCC GTT 434
Ser Cys Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Ala Val
130 135 140

GCG ATT GAA GTA GAA GAC GCA GAG TCA GCT TTC TCC ATC ACT GTA GCT 
Ala lie Glu Vai Giu Asp Ala Glu Ser Ala Phe Ser Tie Ser Val Ala 
145 150 155 

AAT GGC GCT ATT CCT TCG TCG CCT OCT ATC GTC CTC K^T GA^. GCA GTT 
Asn Gly Ala lie Pro Ser Ser Pro Pro lie Val Leu Asn Glu Ala Vai 
160 165 WO 

ACG ATC CCT GAG GTT AAA CTA TAG GGC GAT GTT GTT CTC CGA TAT GTT 
Thr He Ala Glu Val Lvs Leu Tyr Gly Asp Vai Val Leu Arq Tyr Val 
175 180 1B5 190 

AGT TAG AA.^ GCA GAA GAT ACC GAA AAA TCC GAA TTC TTG C.CA GGG TTC 
Ser T'/r Lys Ala Glu Al^o Thr Glu Lys Ser Glu Ptie Leu Pro Giy Phe 
195 200 205 

GAG CGT GTA GAG GAT GCG TCG TCG TTC CCA TTG GAT TAT GGT ATC CGG 
Giu Arq Val Giu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly lie arq 
210 215 220 

CGG GTT GAC CAC GCC GTG GGA AAC GTT CCT GAG CT*: GGT CCG GCT TTA 
Arg Leu Asp His Ala Vai Gly Asn Vai Pro Glu Leu Giy Pro Aia Leu 
225 230 235 

ACT TAT GTA GCG GGG TTC ACT GGT TTT CAC CAA TTC GCA GAG TTC ACA 
Thr Tvr Vai Aia Gly Phe Thr Glv Phe hi:i Gin P^:e Ala Glu Phe Thr 
2i0 245 2^0 

GCA GAC GAC GTT GGA ACC GCC GAG AGC GGT TTA A.^^T TCA GCG GTC CTG 
Aia Aso Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala Vai Leu 
255 * 260 265 270 

GCT AGC AAT GAT GAA ATG GTT CTT CTA CCG ATT r-J\'-: GAG CCA GTG CAC 
Ala Ser Asn Asp Giu Met Vai Leu Leu Pro lit Asn Giu Pro Val His 
275 280 285 

GGA ACA P^J^.G AGG AAG AGT CAG ATT CAG ACG TAT TTG GAA CAT yV'.C GP-Jk 
Gly Lys Arq Lys Ser Gin He Gin Thr '[vj Loii Giu His Asr. Giu 

290 295 300 

GGC GCA GGG CTA CAA CAT CTG GCT CTG ATG AGT G.--A GAC ATA TTC AGG 
Gly A^a Gly Leu Gin His Leu Ala Leu Met Sor Glu Asp lie Phe Arg 
305 310 315 

ACC CTG AGA GAG ATG AGG AAG AGG AGC AGT ATT GGA GGA TTC GAC ' TTC 
Thr Lou Arg Glu Met Arg Lys Arg Scr Ser lie Gly G.i y Phe Asp Phe 
320 325 330 

ATG CCT TCT CCT CCG CCT ACT TAG TAC CAG i^J^T CTC r-JkG I^JKA CGG GTC 
Met ^-o Ser Pro Pro Pro Thr Tyr Tyr Gin Asn Leu Lys Lys Arg Val 
335 340 345 350 

GGC GAC GTG CTC AGC GAT GAT CAG ATC .AAG GAG TGT GAG GAA TTA GGG 
Giy Aso Vai Leu Ser Asp Asp Gin lie Lys Glu Cys Giu Giu Leu Gly 
355 360 365 

ATT CTT GTA GAC AGA GAT GAT CA.2. GGG ACG TTG CTT CAA ATC TTC ACA 
lie Leu Val Asp Arg Asd Asp Gin Gly Thr Lou Leu G I ri l ie Phe Thr 
370 375 380

AAA CCA CTA GGT GAC AGG CCG ACG ATA TTT ATA GAG ATA ATC CAG AGA 1202
Lys Pro Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile Gln Arg
385 390 395

GTA GGA TGC ATG ATG AAA GAT GAG GAA GGG AAG GCT TAG CAG AGT GGA 12 50 
Vai Giy Cys Met Met Lys Asp Glu Giu Gly Lys Ala Tyr Gin Ser Gly 
AOQ ' ' 405 ' " 410 

GGA TGT GGT GGT TTT GGC AAA GGG AAT TTC TCT GAG CTC TTC AAG TCC 12 93 
Gly Cys Gly Gly Phe Giy Lys Gly Asn Phe Ser Glu Leu Phe Lys Ser 
416 420 425 430 

ATT GAA G?J\ TAG GAA AAG ACT CTT GAA GCC A^A CAG TTA GTG GGA TGA 134G 
He Glu Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gin Leu Vai Gly 
435 440 445 

ACAAGAAGAA GAJ\CCAACTA. AAGGATTGTG TAATTAATGT AAAJ^CTGTTT TATCTTATCA 14 00 

AAACAATGTA TACAACATCT CATTTAAAAA CGAGATCAA.T CC 14 48 
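
The listing for SEQ ID NO: 14 pairs each codon of the coding sequence (positions 9 to 1346) with its amino acid. The relationship can be sketched in Python as follows, assuming the standard genetic code; `translate_cds` is a hypothetical helper name, and the input is only the first row of the listing (positions 1 to 50), not the full sequence.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive) into one-letter amino acids."""
    cds = seq[start - 1:end].upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First row of SEQ ID NO: 14: 8 untranslated bases, then 14 codons.
first_row = "TGAAATCA" + "ATGGGCCACCAAAACGCCGCCGTTTCAGAGAATCAAAACCAT"
print(translate_cds(first_row, 9, 50))  # MGHQNAAVSENQNH
```

The one-letter output corresponds to Met Gly His Gln Asn Ala Ala Val Ser Glu Asn Gln Asn His, the first fourteen residues given beneath that row.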

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 446 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Gly His Gln Asn Ala Ala Val Ser Glu Asn Gln Asn His Asp Asp
1 5 10 15

Gly Ala Ala Ser Ser Pro Gly Phe Lys Leu Val Gly Phe Ser Lys Phe
20 25 30

Val Arg Lys Asn Pro Lys Ser Asp Lys Phe Lys Val Lys Arg Phe His
35 40 45

His Ile Glu Phe Trp Cys Gly Asp Ala Thr Asn Val Ala Arg Arg Phe
50 55 60

Ser Trp Gly Leu Gly Met Arg Phe Ser Ala Lys Ser Asp Leu Ser Thr
65 70 75 80

Gly Asn Met Val His Ala Ser Tyr Leu Leu Thr Ser Gly Asp Leu Arg
85 90 95

Phe Leu Phe Thr Ala Pro Tyr Ser Pro Ser Leu Ser Ala Gly Glu Ile
100 105 110

Lys Pro Thr Thr Thr Ala Ser Ile Pro Ser Phe Asp His Gly Ser Cys
115 120 125

Arg Ser Phe Phe Ser Ser His Gly Leu Gly Val Arg Ala Val Ala Ile
130 135 140

Glu Val Glu Asp Ala Glu Ser Ala Phe Ser Ile Ser Val Ala Asn Gly
145 150 155 160

Ala Ile Pro Ser Ser Pro Pro Ile Val Leu Asn Glu Ala Val Thr Ile
165 170 175

Ala Glu Val Lys Leu Tyr Gly Asp Val Val Leu Arg Tyr Val Ser Tyr
180 185 190

Lys Ala Glu Asp Thr Glu Lys Ser Glu Phe Leu Pro Gly Phe Glu Arg
195 200 205

Val Glu Asp Ala Ser Ser Phe Pro Leu Asp Tyr Gly Ile Arg Arg Leu
210 215 220

Asp His Ala Val Gly Asn Val Pro Glu Leu Gly Pro Ala Leu Thr Tyr
225 230 235 240

Val Ala Gly Phe Thr Gly Phe His Gln Phe Ala Glu Phe Thr Ala Asp
245 250 255

Asp Val Gly Thr Ala Glu Ser Gly Leu Asn Ser Ala Val Leu Ala Ser
260 265 270

Asn Asp Glu Met Val Leu Leu Pro Ile Asn Glu Pro Val His Gly Thr
275 280 285

Lys Arg Lys Ser Gln Ile Gln Thr Tyr Leu Glu His Asn Glu Gly Ala
290 295 300

Gly Leu Gln His Leu Ala Leu Met Ser Glu Asp Ile Phe Arg Thr Leu
305 310 315 320

Arg Glu Met Arg Lys Arg Ser Ser Ile Gly Gly Phe Asp Phe Met Pro
325 330 335

Ser Pro Pro Pro Thr Tyr Tyr Gln Asn Leu Lys Lys Arg Val Gly Asp
340 345 350

Val Leu Ser Asp Asp Gln Ile Lys Glu Cys Glu Glu Leu Gly Ile Leu
355 360 365

Val Asp Arg Asp Asp Gln Gly Thr Leu Leu Gln Ile Phe Thr Lys Pro
370 375 380

Leu Gly Asp Arg Pro Thr Ile Phe Ile Glu Ile Ile Gln Arg Val Gly
385 390 395 400

Cys Met Met Lys Asp Glu Glu Gly Lys Ala Tyr Gln Ser Gly Gly Cys
405 410 415

Gly Gly Phe Gly Lys Gly Asn Phe Ser Glu Leu Phe Lys Ser Ile Glu
420 425 430

Glu Tyr Glu Lys Thr Leu Glu Ala Lys Gln Leu Val Gly
435 440 445
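
The sequence listing uses three-letter amino acid codes, while the alignment in FIG. 3A uses one-letter codes. As a convenience for comparing the two, the following Python sketch converts a row of the listing into one-letter form. The helper name `to_one_letter` is hypothetical, and the mapping is the standard set of twenty amino acid abbreviations rather than anything defined in this document.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_row: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[code] for code in listing_row.split())


# First row of SEQ ID NO: 15 (residues 1 to 16).
row = "Met Gly His Gln Asn Ala Ala Val Ser Glu Asn Gln Asn His Asp Asp"
print(to_one_letter(row))  # MGHQNAAVSENQNHDD
```

An unrecognized code raises `KeyError`, which is a deliberate choice here: it flags misread residue names instead of silently skipping them.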

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 513 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Vernonia galamenensis

(vii) IMMEDIATE SOURCE:

(B) CLONE: vs1.pk0015.b2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCACACCGAT TGCCGGAACT TCACCGCCTC TCACGGCCTT GCAGTCCGAG C^^J^.TCGCCAT 6C 

TGAAGTCGAT GACGCCGAAT TAGCTTTCTC CGTCAGCGTC TCTCACGGCG CTAAACCCTC 120 

CGCTGCTCCT GTAACCCTTG GAAACAACGA CGTCGTATTG TCTGAAGTTA AGCTTTACGG 180 

CGATGTCGCT TTCCGGTACA TAAGTTACAA AAATCCGAAC TATACATCTT CCTTTTTGCC 240

CGGGTTCGAG CCCGTTGAAA AGACGTCGTC GTTTTATGAC CTTGACTACG GTATCCGCCG 300

TTTGGACCAC GCCGTAGGNA ACGTCCCTGA GCTTGCTTCG GCAGTGGACT ACGTGAAATC 360

ATTCACCGGA TTCCATGAGT TCGCCGAATT CACCGCGGAG GACGTCGGGA CGAGCGAGAG 420

GGAACTGAAT TCGGTCGTTT TAGCTTGCAA CAGTGAGATG GTCTTGATTC CGATGAACGA 480

GCCGGTGTAC GGAANAAAAG GAAGNAGCCA GAT 513

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 5, line ]

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
US

Date of deposit
25 June 1996 (25.06.96)

Accession Number
98083

C. ADDITIONAL INDICATIONS (leave blank if not applicable)

This information is continued on an additional sheet

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only

This sheet was received with the international application

For International Bureau use only

[ ] This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description

on page ^ , line 1 ^

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
US

Date of deposit
25 June 1996 (25.06.96)

Accession Number
97622

C. ADDITIONAL INDICATIONS (leave blank if not applicable)

This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

[ ] This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description

on page ^ . line }i

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
US

Date of deposit
12 June 1997

Accession Number
209120

C. ADDITIONAL INDICATIONS (leave blank if not applicable)

This information is continued on an additional sheet

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only

This sheet was received with the international application

For International Bureau use only

[ ] This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)

CLAIMS

1. An isolated nucleic acid fragment encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the fragment comprising a nucleotide
sequence selected from the group consisting of
nucleotide sequences encoding a polypeptide comprising the amino
acid sequences set forth in SEQ ID NO:3, SEQ ID NO:11, SEQ ID
NO:13, and SEQ ID NO:15 and
modified nucleotide sequences essentially similar to the nucleotide
sequences of SEQ ID NO:2, SEQ ID NO:10, SEQ ID NO:12 and
SEQ ID NO:14 containing deletions, insertions, or substitutions in
the sequence that do not affect the functional properties of the
encoded protein.

2. An isolated nucleic acid fragment encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the fragment comprising a nucleotide sequence as
set forth in SEQ ID NO:14.

3. A chimeric gene comprising the nucleic acid fragment of Claims 1 or
2 operably linked to at least one suitable regulatory sequence.

4. The chimeric gene of Claim 3 wherein at least one suitable regulatory
sequence directs gene expression in a microorganism.

5. The chimeric gene of Claim 3 wherein the at least one suitable
regulatory sequence directs gene expression in a plant.

6. A plasmid vector comprising the nucleic acid fragment of Claims 1 or
2 operably linked to at least one suitable regulatory sequence.

7. A transformed host cell comprising a host cell and the plasmid vector
of Claim 6.

8. The transformed host cell of Claim 7 wherein the host cell is derived
from a plant or is a microorganism.

9. The transformed host cell of Claim 8 wherein the microorganism is
E. coli.

10. A transformed plant tolerant to contact with at least one compound
that inhibits the rate of the reaction of p-hydroxyphenylpyruvate dioxygenase
enzyme in a non-transformed plant, the transformed plant comprising the chimeric
gene of Claim 3 and a host plant.

11. The transformed plant of Claim 10 wherein the host plant is a cereal
crop plant.

12. A method to identify a compound useful for its ability to inhibit the
rate of the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme comprising:

(a) transforming a host cell with the plasmid vector of Claim 6;

(b) facilitating expression of the nucleic acid fragment encoding the
plant p-hydroxyphenylpyruvate dioxygenase enzyme;

(c) contacting the expressed enzyme from step (b) with a test
compound; and

(d) evaluating the capacity of the test compound to inhibit the rate of
the reaction of p-hydroxyphenylpyruvate dioxygenase enzyme.

13. The method of Claim 12 wherein evaluating the capacity of the test
compound to inhibit the rate of the reaction of p-hydroxyphenylpyruvate
dioxygenase enzyme is accomplished by measuring oxygen utilization, carbon
dioxide release, homogentisate production, loss of p-hydroxyphenylpyruvate or
maleylacetoacetate production.
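
Claims 12 and 13 turn on comparing the reaction rate of the enzyme with and without a test compound. One common way to express that comparison, offered here only as an illustrative sketch and not as the method of the claims, is percent inhibition relative to an uninhibited control. The function name `percent_inhibition` and the example rates are hypothetical; the rates may be in any consistent unit (for instance oxygen consumed or homogentisate formed per minute).

```python
def percent_inhibition(control_rate: float, treated_rate: float) -> float:
    """Percent reduction of the reaction rate relative to the uninhibited control."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - treated_rate / control_rate)


# Hypothetical rates, in arbitrary but identical units.
print(percent_inhibition(2.0, 0.5))  # 75.0
```

A value near 0 indicates no measurable effect of the test compound under the assay conditions, and a value near 100 indicates nearly complete inhibition.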

14. The method of Claim 12 wherein the transformed host cell is an
E. coli that comprises a chimeric gene encoding a plant p-hydroxyphenylpyruvate
dioxygenase enzyme.

15. A compound that inhibits the activity of a plant p-hydroxyphenylpyruvate
dioxygenase enzyme, the compound identified by the method of
Claim 14.

16. A method for imparting tolerance to a plant to at least one compound
that inhibits the rate of reaction of p-hydroxyphenylpyruvate dioxygenase enzyme
comprising:

(a) transforming a host plant cell with a chimeric gene comprising a
nucleic acid fragment encoding plant p-hydroxyphenylpyruvate
dioxygenase, and

(b) expressing the chimeric gene in an amount effective to render
the transformed plant substantially tolerant to the at least one
compound that inhibits the rate of reaction of p-hydroxyphenylpyruvate
dioxygenase.

17. A method for the microbial production of active plant p-hydroxyphenylpyruvate
dioxygenase enzyme comprising:

(a) stably transforming a microorganism with the chimeric gene of
Claim 4 encoding the plant p-hydroxyphenylpyruvate
dioxygenase;

(b) facilitating expression by the chimeric gene for a suitable period;
and

(c) recovering active plant p-hydroxyphenylpyruvate dioxygenase
enzyme.

18. A method to overexpress p-hydroxyphenylpyruvate dioxygenase
enzyme in a plant comprising:

(a) stably transforming a host plant cell with a chimeric DNA
molecule comprising at least one copy of a suitable promoter to
drive expression of an associated coding sequence in a plant cell
operably linked to at least one copy of a homologous or
heterologous coding sequence encoding p-hydroxyphenylpyruvate
dioxygenase; and

(b) growing the transformed host plant cell of step (a).

19. The method of Claim 18 wherein the chimeric DNA molecule is the
chimeric gene of Claim 5.

20. An isolated nucleic acid fragment comprising a member selected from
the group consisting of:

(a) an isolated nucleic acid fragment as set forth in SEQ ID NO:16;

(b) an isolated nucleic acid fragment that is essentially similar to an
isolated nucleic acid fragment as set forth in SEQ ID NO:16;
and

(c) an isolated nucleic acid fragment that is complementary to (a) or
(b).

FIG.1 



1 CAAGAAACGNGTCGNCGACGTGCTCAGCGATGATCAGATCAAGGAGTGTGAGGAATTAGG 

61 GATTCTTNTAGACAGAGATGATCAAGGGACGTTNCTTCAAATCTNCACAAAACCACTAGG 

121 TGACAGGCCGACGNTATTTATAGAGATAATCCAGAGNGTAGGATGCATGATGAAAGATGT 

181 GGAAGGGANGGCTTACCAGAGTGGAGNATNTNGTGGTTTTGGCAAAGGCAATT

FIG. 2

1 TGAAATCAATGGGCCACCAAAACGCCGCCGTTTCAGAGAATCAAAACCATGATGACGGCG 

61 CTGCGTCGTCGCCGGGATTCAAGCTCGTCGGATTTTCCAAGTTCGTAAGAAAGAATCCAA 

121 AGTCTGATAAATTCAAGGTTAAGCGCTTCCATCACATCGAGTTCTGGTGCGGGGACGCAA 

£;co47III 

181 CCAACGTCGCTCGTCGCTTCTCCTGGGGTCTGGGGATGAGATTCTCCGCCAAATCCGATC 

2 41 TTTCCACCGGAAACATGGTTCACGCCTCTTACCTACTCACCTCCGGTGAACTCCGATTCC 

301 TTTTCACTGCTCCTTACTCTCCGTCTCTCTCCGGCGGAGAGATTAAACCGACAACCACAG 

361 GTTCTATCCCAAGTTTCGATCACGGGXCTTGTCGGTCCTTCTTCTCTTCACATGGTCTCG 

4 21 GTGTTAGACCCGTTGCGATTGAAGTAGAAGACGCGGAGTCAGCTTTCTCCATCAGTGTAG 

4 81 CTAATGGCGCTATTCCTTCGTCGCCTCCTATCGTCCTCAATGAAGCAGTTACGATCGCTG 

541 AGGTT AAACTATACGGCGATGTTGTTCTCCGATATGTTAGTTACAAAGCAGAAGATACCG 

601 AAAAATCCGAATTCTTGCCAGGGTTCGAGCGTGTAGAGGATGCGTCGTCGTTCCCATTGG 
EcoRl 

661 ATTATGGTATCCGGCGGCTTGACCACGCCGTGGGAAACGTTCCTGAGCTTGGTCCGGCTT 

721 TAACTTATGTAGCGGGGTTCACTGGTTTTCACCAATTCGCAGAGTTCACAGCAGACGACG 

781 TTGGAACCGCCGAGAGCGGTTTAAATTCAGCGGTCCTGGCTAGCAATGATGAAATGGTTC 

Nhel 

8 41 TTCTACCGATTAACGAGCCAGTGCACGGAACAAAGAGGAAGAGTCAGATTCAGACGTATT 
901 TGGAACATAACGAAGGCGCAGGGCTACAACATCTGGCTCTGATGAGTGAAGACATATTCA 

9 61 GGACCCTGAGAGAGATGAGGAAGAGGAGCAGTATTGGAGGATTCGACTTCATGCCTTCTC 
1021 CTCCGCCTACTTACTACCAGAATCTCAAGAAACGGGTCGGCGACGTGCTCAGCGATGATC 
1081 AGATCAAGGAGTGTGAGGAATTAGGGATTCTTGTAGACAGAGATGATCAAGGGACGTTGC 
1141 TTCAAATCTTCACAAAACCACTAGGTGACAGGCCGACGATATTTATAGAGATAATCCAGA 
1201 GAGTAGGATGCATGATGAAAGATGAGGAAGGGAAGGCTTACCAGAGTGGAGGATGTGGTG 
1261 GTTTTGCCAAAGGCAATTTCTCTGAGCTCTTCAAGTCCATTGAAGAATACGAAAAGACTC 
1321 TTGAAGCCAAACAGTTAGTGGGATGAACAAGAAGAAGAACCAACTAAAGGATTGTGTAAT 
1381 TAATGTAAAACTGTTTTATCTTATCAAAACAATGTATACAACATCXCATTTAAAAACGAG 
1441 ATCAATCC
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FIG.3A 



Arabidopsis MGHQWAA75 E^/O^J H DCGA^. JSFGFEx'LVGr SKTVP.KM ? uKTK'/KRFHH

Corn MPFTPiAAAA GPJ^.VP^J-JKSJK.-, EQP-^^.FRLVGH RNrVRTNPF.S CRrHTLAFHH

VV/DKGPK? cF.GRFLHFHS 

^''^^-'^'^ M TTVNMKGPK? ERGP.FLH FH 5 

M TTV3DKGAK? ZRGRFLHFHS 

-"-^ M TSY5DKGZK? EP.GRFLHFHS 
51                                                   100
Arabidopsis IEFWCGDATN VARRFSWGLG MRFSAKSDLS TGNMVHASYL LTSGDLRFLF
Corn VELWCADAAS AAGRFSFGLG APLAARSDLS TGNSAHASLL LRSGSLSFLF
Rat VTFWVGNAKQ AASFYCNKMG FEPLAYKGLE TGSREVVSHV IKQGKIVFVL
Mouse VTFWVGNAKQ AASFYCNKMG FEPLAYRGLE TGSREVVSHV IKRGKIVFVL
Human VTFWVGNAKQ AASFYCSKMG FEPLAYRGLE TGSREVVSHV IKQGKIVFVL
Pig VTFWVGNAKQ AASYYCSKIG FEPLAYKGLE TGSREVVSHV VKQDKIVPVF



Arabidopsis TAPYSPSLSA GEIKPTTTAS IPSFDHGSCR SFFSSHGLGV RAVAIEVEDA 



Corn TAPYAHGADA ATAA LPSFSAAAAR RFAADHGLAV RAVALRVADA 

Rat CSALNPW NKEMG DHLVKHGDGV KDIAFEVEDC 

Mouse CSALNPW NKEMG DHLVKHGDGV KDIAFEVEDC 

Human SSALNPW NKEMG DHLVKHGDGV KDIAFEVEDC 

Pig SSALNPW NKEMG DHLVKHGDGV KDIAFEVEDC 



151                                                  200
Arabidopsis ESAFSISVAN GAIPSSPPIV LNEAVTIAEV KLYGDVVLRY VSYKAEDTEK
Corn EDAFRASVAA GARPAFGPVD LGRGFRLAEV ELYGDVVLRY VSY.PDGAAG
Rat EHIVQKARER GAKIVREPWV EEDKFGKVKF AVLQTYGDTT HTLVEKINYT
Mouse DHIVQKARER GAKIVREPWV EQDKFGKVKF AVLQTYGDTT HTLVEKINYT
Human DYIVQKARER GAKIMREPWV EQDKFGKVKF AVLQTYGDTT HTLVEKMNYI
Pig DYIVQKARER GAIIVREPWI EQDKFGKVKF AVLQTFGDTT HTLVEKMNYT



201                                                  250
Arabidopsis SEFLPGFER. ..VEDASSFP LDYGIRRLDH AVGNVP..EL GPALTYVAGF
Corn EPFLPGFEG. ..V..ASPGA ADYGLSRFDH IVGNVP..EL APAAAYFAGF
Rat GRFLPGFEAP TYKDTLLPKL PSCNLEIIDH IVGNQPDQEM ESASEWYLKN
Mouse GRFLPGFEAP TYKDTLLPKL PRCNLEIIDH IVGNQPDQEM QSASEWYLKN
Human GQFLPGYEPP AFMDPLLPKL PKCSLEMIDH IVGNQPDQEM VSASEWYLKN
Pig GCFLPGFEAP TFTDPLLSKL PKCGLEIIDH IVGNQPDQEM ESASQWYMRN

251                                                  300
Arabidopsis TGFHQFAEFT ADDVGTAESG LNSAVLASND EMVLLPINEP VHGTKRKSQI
Corn TGFHEFAEFT TEDVGTAESG LNSMVLANNS ENVLLPLNEF VHGTKRRSQI
Rat LQFHRFWSVD DTQVHTEYSS LRSIVVANYE ESIKMPINEP APG.RKKSQI
Mouse LQFHRFWSVD DTQVHTEYSS LRSIVVTNYE ESIKMPINEP APG.RKKSQI
Human LQFHRFWSVD DTQVHTEYSS LRSIVVANYE ESIKMPINEP APG.KKKSQI
Pig LQFHRFWSVD DTQIHTEYSA LRSVVMANYE ESIKMPINEP APG.KKKSQI
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FIG.3B 



301                                                  350
Arabidopsis QTYLEHNEGA GLQHLALMSE DIFRTLREMR KRSSIGGFDF MFSPPPTYYQ
Corn QTFLDHHGGP GVQHMALASD DVLRTLREMQ ARSAMGGFEF MAPPTSDYYD
Rat QEYVDYNGGA GVQHIALRTE DIITTIRHLR ER....GMEF LAVP.SSYYR
Mouse QEYVDYNGGA GVQHIALKTE DIITAIRHLR ER....GTEF L-V\P.SSYYK
Human QEYVDYNGGA GVQHIALKTE DIITAIRHLR ER....GLEF LSVP.STYYK
Pig QEYVDYNGGA GVQHIALKTE DIITAIRSLR ER....GVEF LAVP.FTYYK



351                                                  400
Arabidopsis NLKK..RVGD VLSDDQIKEC EELGILVDRD DQGTLLQIFT KPLGDRPTIF
Corn GVRR..RAGD VLTEAQIKEC QELGVLVDRD DQGVLLQIFT KPVGDRPTLF
Rat LLRENLKTSK IQVKENMDVL EELKILVDVD EKGYLLQIFT KPMQDRPTLF
Mouse LLRENLKSAK IQVKESMDVL EELHILVDVD EKGYLLQIFT KPMQDRPTLF
Human QLREKLKTAK IKVKENIDAL EELKILVDYD EKGYLLQIFT KPVQDRPTLF
Pig QLQEKLKSAK IRVKESIDVL EELKILVDYD EKGYLLQIFT KPMQDRPTVF



Arabidopsis 
Corn 
Rat 
Mouse 
Human 
Pig 



401 

IEIIQRVGCM MKDEEGKAYQ SGGCGGFGKG NFSELFKSIE EYEKTLEAKQ

^^5^ooiS^^ EKDEKGQEYQ KGGCGGFGKG NFSQLFKSIE DYEKSLEAKQ 
lKiSrhnho ^^^^^ NFNSLFKAFE E . EQALRG 

LEViShS ^'■^''^ NFNSLFKAFE E . EQALRGNL 

^EvJSho "^^^""^ NFNSLFKAFE E . EQNLRGNL 
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